Python urllib2 help
posted in Off Topic
#1

Hello,

I'm using Python's urllib2 module to try and grab the source of ESEA match pages. Unfortunately I get the below error:

urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily

I'm pretty sure this is due to ESEA automatically redirecting to their pointless start page if there is no cookie saying you've already seen it. I don't really know enough about HTTP to truly understand what's happening, though, or how to remedy it. Any ideas?
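
For anyone hitting the same error, here is a minimal sketch of how the redirect chain could be made visible. It assumes Python 3, where urllib2 lives on as urllib.request. LoggingRedirectHandler and trace_redirects are hypothetical names made up for this example, not part of the library. The stock redirect handler raises this HTTPError once the same URL has been revisited more than its max_repeats limit or the chain grows past max_redirections, so logging each hop should show which URLs the server keeps bouncing between.

```python
import urllib.error
import urllib.request


class LoggingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Record every redirect the server asks for, then follow it as usual."""

    def __init__(self):
        self.seen = []  # list of (status code, target URL) tuples

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.seen.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def trace_redirects(url):
    """Open url (a real network request) and return the redirects observed."""
    handler = LoggingRedirectHandler()
    opener = urllib.request.build_opener(handler)
    try:
        opener.open(url, timeout=10).close()
    except urllib.error.HTTPError as err:
        # This is where the "infinite loop" error would surface.
        print("gave up:", err)
    return handler.seen
```

If the list returned by trace_redirects keeps alternating between the match page and the welcome page, that would support the missing-cookie theory.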

Thanks

#2

had to find this out a while ago for stat screens in our casts. there is a cookie called "viewed_welcome_page" and the content is 1, with an expiration date of 1 year, that is set when you are done viewing the welcome page. set this cookie before you execute anything else in your script and it should work okay without redirecting you to that stupid page. i'm rusty as fuck when it comes to python though so i might not be able to help you, sorry.
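
A rough sketch of what "set this cookie before you execute anything else" could look like with a cookie jar, assuming Python 3 (http.cookiejar and urllib.request rather than urllib2). The cookie name, value, and one-year lifetime come from the description above. The domain "play.esea.net", the path "/", and the helper names are assumptions made for illustration.

```python
import time
import urllib.request
from http.cookiejar import Cookie, CookieJar

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def make_welcome_cookie(domain="play.esea.net"):
    """Build the viewed_welcome_page=1 cookie by hand (domain and path are guesses)."""
    return Cookie(
        version=0, name="viewed_welcome_page", value="1",
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False,
        expires=int(time.time()) + ONE_YEAR_SECONDS,  # one year from now
        discard=False, comment=None, comment_url=None, rest={},
    )


def build_opener_with_cookie():
    """Return (opener, jar); the jar already holds the welcome-page cookie."""
    jar = CookieJar()
    jar.set_cookie(make_welcome_cookie())
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

Every request opened through that opener to a matching host should then carry the cookie, and any cookies the server sets in its responses end up in the same jar.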

#3
[quote=mthsad]had to find this out a while ago for stat screens in our casts. there is a cookie called "viewed_welcome_page" and the content is 1, with an expiration date of 1 year, that is set when you are done viewing the welcome page. set this cookie before you execute anything else in your script and it should work okay without redirecting you to that stupid page. i'm rusty as fuck when it comes to python though so i might not be able to help you, sorry.[/quote]

Yeah, I found that cookie already, I'm just not sure how to set it and send it with the request. Thanks though.
#4
[quote=reilly][quote=mthsad]had to find this out a while ago for stat screens in our casts. there is a cookie called "viewed_welcome_page" and the content is 1, with an expiration date of 1 year, that is set when you are done viewing the welcome page. set this cookie before you execute anything else in your script and it should work okay without redirecting you to that stupid page. i'm rusty as fuck when it comes to python though so i might not be able to help you, sorry.[/quote]

Yeah, I found that cookie already, I'm just not sure how to set it and send it with the request. Thanks though.[/quote]

did a bit of Stack Overflow research and found this. hope it helps you: http://stackoverflow.com/questions/5606083/how-to-set-and-retrieve-cookie-in-http-header-in-python
#5

Thanks, I actually managed to find a much more elegant solution. I always seem to find solutions right after I post threads.

>>> import urllib2
>>> opener = urllib2.build_opener()
>>> opener.addheaders.append(('Cookie', 'viewed_welcome_page=1'))
>>> f = opener.open('http://play.esea.net/index.php?s=stats&d=match&id=3492240')

Turns out the urllib2 opener has an addheaders list of default headers that get sent with every URL it opens, so a Cookie header can simply be appended to it.
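
For reference, a sketch of the same approach on Python 3, where urllib2 was folded into urllib.request. The function names and the MATCH_URL constant are made up for this example. The cookie and URL are the ones used above.

```python
import urllib.request

MATCH_URL = "http://play.esea.net/index.php?s=stats&d=match&id=3492240"


def build_cookie_opener():
    """Return an opener that sends the welcome-page cookie with every request."""
    opener = urllib.request.build_opener()
    # addheaders is a plain list of (name, value) tuples applied to each request.
    opener.addheaders.append(("Cookie", "viewed_welcome_page=1"))
    return opener


def fetch_match_page(url=MATCH_URL):
    """Fetch the page source as bytes (this makes a real network request)."""
    with build_cookie_opener().open(url, timeout=10) as response:
        return response.read()
```

The headers in addheaders are only filled in when a request does not already carry a header of the same name, so an explicit Cookie header set on an individual Request would take precedence.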

Thanks for the help.

#6

While you have probably already implemented what you want to do, I highly recommend you check out requests over urllib2. It makes pretty much everything webpage-related a lot easier.

[url=http://www.python-requests.org/en/latest/user/quickstart/#cookies]For the specific problem at hand, here is the method.[/url]
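
A short sketch of what that might look like for this thread's case, assuming the third-party requests package is installed. The helper names and constants are made up for illustration. The cookie and URL are the ones from the posts above.

```python
import requests

MATCH_URL = "http://play.esea.net/index.php?s=stats&d=match&id=3492240"
COOKIES = {"viewed_welcome_page": "1"}


def prepare_match_request(url=MATCH_URL):
    """Build the request without sending it, so the Cookie header can be inspected."""
    return requests.Request("GET", url, cookies=COOKIES).prepare()


def fetch_match_page(url=MATCH_URL):
    """Fetch the page source as text (this makes a real network request)."""
    response = requests.get(url, cookies=COOKIES, timeout=10)
    response.raise_for_status()
    return response.text
```

prepare_match_request is only there to check what would be sent. For normal use the single requests.get call with the cookies argument is all that is needed.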
#7

+1 for requests. Run away from urllib. Far, far away.
