urlfetch redirected into an infinite loop in Python


Problem description


I am trying to load a URL that redirects to itself. I assume the site sets a cookie and then looks for it, but it never sees it, so the requests loop forever.

I have tried urllib2, urlfetch, and httplib2. None work.

I tried this though:

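import urllib2

url = "http://www.cafebonappetit.com/menu/your-cafe/collins-cmc/cafes/details/50/collins-bistro"
# Follow redirects and carry cookies across them.
redirect_handler = urllib2.HTTPRedirectHandler()
cookie_processor = urllib2.HTTPCookieProcessor()
opener = urllib2.build_opener(redirect_handler, cookie_processor)
# Note: this second assignment replaces the URL set above.
url = 'http://www.nytimes.com/2005/10/26/business/26fed.html?pagewanted=print'
page = opener.open(url)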

This works in a shell, but not on Google App Engine. The urlfetch documentation (http://code.google.com/appengine/docs/python/urlfetch/fetchfunction.html) says, under follow_redirects: "Cookies are not handled upon redirection. If cookie handling is needed, set follow_redirects to False and handle both cookies and redirects manually."

I have no idea how to do this and the documentation doesn't seem to give any clues either.

I googled the hell out of this issue and there are NO reported issues like this that work for my problem.

Solution

A little more explanation. Glad that at least the website's behavior is explained: it wants some cookie, and if the cookie isn't set, it redirects to itself with a cookie-setting header. You should probably read up on how cookies work: the website sends the cookie in a Set-Cookie header, and the browser must echo it back (with some variations) in a Cookie header. Python has a library for managing collections of cookies, cookielib, to help you with this.
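To make the Set-Cookie / Cookie difference concrete, here is a minimal sketch. It uses SimpleCookie from the standard library rather than cookielib, purely to keep the example short, and the cookie name and value are made up for illustration. The function name cookie_header_from_set_cookie is hypothetical.

```python
try:
    from http.cookies import SimpleCookie  # Python 3
except ImportError:
    from Cookie import SimpleCookie  # Python 2


def cookie_header_from_set_cookie(set_cookie_value):
    """Turn one Set-Cookie response value into a Cookie request value.

    Attributes such as Path and Max-Age belong to the response only;
    the request header carries just the name=value pairs.
    """
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    return "; ".join(
        "%s=%s" % (name, parsed[name].coded_value) for name in sorted(parsed))


# Made-up example value, not taken from the site in the question.
print(cookie_header_from_set_cookie("sessionid=abc123; Path=/; Max-Age=3600"))
# sessionid=abc123
```

The attributes after the first semicolon tell the client how to store the cookie, which is why they are dropped when the cookie is sent back.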

It's best to use the native urlfetch API; its return object has a headers attribute, a dict of all the response headers (e.g. the Set-Cookie header). To send specific headers, use the headers argument to the urlfetch.fetch() function. Here you will use the Cookie header (but remember that the format of the Cookie header you set is not the same as that of the Set-Cookie header you receive; that's where cookielib comes in).
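The manual handling described above could be sketched roughly as follows. This is an illustration under assumptions, not tested on App Engine: the fetch argument is a hypothetical callable that performs a single request without following redirects and returns an object with status_code and a headers dict. FakeResponse and fake_fetch are stand-ins that simulate a site redirecting to itself until it sees its cookie, and the sketch reads only a single Set-Cookie value per response.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

REDIRECT_CODES = (301, 302, 303, 307)


def fetch_with_cookies(fetch, url, max_redirects=5):
    """Follow redirects by hand, echoing received cookies back."""
    cookies = {}
    for _attempt in range(max_redirects + 1):
        headers = {}
        if cookies:
            headers["Cookie"] = "; ".join(
                "%s=%s" % (name, cookies[name]) for name in sorted(cookies))
        response = fetch(url, headers)
        set_cookie = response.headers.get("Set-Cookie")
        if set_cookie:
            # Only the leading name=value pair is echoed back; attributes
            # such as Path or Expires stay out of the Cookie header.
            pair = set_cookie.split(";", 1)[0].strip()
            name, _sep, value = pair.partition("=")
            cookies[name] = value
        if response.status_code not in REDIRECT_CODES:
            return response
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, response.headers["Location"])
    raise RuntimeError("gave up after %d redirects" % max_redirects)


class FakeResponse(object):
    """Stand-in for a fetch result, used only for this demonstration."""

    def __init__(self, status_code, headers, content=""):
        self.status_code = status_code
        self.headers = headers
        self.content = content


def fake_fetch(url, headers):
    """Simulate a site that redirects to itself until it sees its cookie."""
    if "seen=1" not in headers.get("Cookie", ""):
        return FakeResponse(
            302, {"Set-Cookie": "seen=1; Path=/", "Location": url})
    return FakeResponse(200, {}, "page body")


result = fetch_with_cookies(fake_fetch, "http://example.com/menu")
print("%d %s" % (result.status_code, result.content))
```

On App Engine, the fetch callable would presumably be a thin wrapper such as lambda url, headers: urlfetch.fetch(url, headers=headers, follow_redirects=False). The max_redirects cap keeps a misbehaving site from causing the same infinite loop.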

Good luck!

PS. Using curl -v, it's easy to see that the site actually sends three different Set-Cookie headers. You probably have to deal with all three.
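Dealing with several Set-Cookie headers might look like the sketch below. It assumes, based on my reading of the urlfetch documentation, that repeated headers arrive joined into one comma-separated string in the headers dict. The split is a heuristic, the function names are hypothetical, and the cookie names a, b and c are placeholders, not the site's real cookies.

```python
import re

# Heuristic: split only at commas that are followed by a new "name=" token,
# so the comma inside an Expires date is left alone.
_COOKIE_BOUNDARY = re.compile(r",\s*(?=[^;,=\s]+=)")


def split_set_cookie(joined):
    """Split a comma-joined Set-Cookie string into individual values."""
    return [part.strip() for part in _COOKIE_BOUNDARY.split(joined)]


def cookie_header(set_cookie_values):
    """Combine the name=value pair of each Set-Cookie value into one header."""
    pairs = [value.split(";", 1)[0].strip() for value in set_cookie_values]
    return "; ".join(pairs)


# Placeholder input with three cookies, one of which has an Expires date.
joined = "a=1; Expires=Wed, 09 Jun 2021 10:18:14 GMT; Path=/, b=2; Path=/, c=3"
values = split_set_cookie(joined)
print(values)
print(cookie_header(values))
```

If the response object exposes the raw header list, reading the individual Set-Cookie values from there is more reliable than splitting a joined string.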
