TypeError: expected httplib.Message, got <type 'instance'>. when using requests.get(url) on GAE

Problem description

My aim is to build a web crawler and host it on GAE. However, when I try to run a very basic implementation, I get the following error:

Traceback (most recent call last):
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "E:\WSE_NewsClusteriing\crawler\crawler.py", line 14, in get
    source_code = requests.get(url)
  File "libs\requests\api.py", line 67, in get
    return request('get', url, params=params, **kwargs)
  File "libs\requests\api.py", line 53, in request
    return session.request(method=method, url=url, **kwargs)
  File "libs\requests\sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "libs\requests\sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "libs\requests\adapters.py", line 376, in send
    timeout=timeout
  File "libs\requests\packages\urllib3\connectionpool.py", line 559, in urlopen
    body=body, headers=headers)
  File "libs\requests\packages\urllib3\connectionpool.py", line 390, in _make_request
    assert_header_parsing(httplib_response.msg)
  File "libs\requests\packages\urllib3\util\response.py", line 49, in assert_header_parsing
    type(headers)))
TypeError: expected httplib.Message, got <type 'instance'>.

My main.py is as follows:

import sys
sys.path.insert(0, 'libs')

import webapp2
import requests
from bs4 import BeautifulSoup

class MainPage(webapp2.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        url = 'http://www.bbc.com/news/world'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('a', {'class': 'title-link'}):
            href = 'http://www.bbc.com' + link.get('href')
            self.response.write(href)


app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)

The thing is that the crawler works fine as a standalone Python application.

Can someone help me figure out what's wrong here? Does the requests module cause some compatibility issues with GAE?

Recommended answer

I would advise against using the requests library on App Engine for the time being, as it is not officially supported and is therefore very likely to run into compatibility issues. As per the URL Fetch Python API article, the supported options are urllib, urllib2, httplib, and using urlfetch directly. The requests library also relies on urllib3 (the traceback above fails inside the copy of urllib3 bundled with requests), and that library is not yet supported either.

Feel free to consult the URL Fetch article for simple examples of urllib2 and urlfetch requests. If these libraries do not work for you in some way, feel free to say so in your question.
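As a rough illustration of the urllib2 route, here is a minimal sketch of a fetch helper that could stand in for the requests.get(url) call in the question. The names fetch_html and opener and the 10-second timeout are assumptions made for this example, not part of any App Engine API. The Python 3 import fallback is only there so the snippet can also be run outside the Python 2.7 runtime.

```python
try:
    import urllib2  # Python 2.7, the runtime used in the question
except ImportError:
    import urllib.request as urllib2  # fallback so the sketch also runs on Python 3


def fetch_html(url, timeout=10, opener=urllib2.urlopen):
    """Return the body of `url` as a byte string.

    `opener` is a hypothetical hook (defaulting to urllib2.urlopen) that
    makes the helper easy to exercise without network access.
    """
    response = opener(url, timeout=timeout)
    try:
        return response.read()
    finally:
        # Always release the connection, even if read() raises.
        response.close()
```

In the handler from the question, `plain_text = fetch_html(url)` would then replace the two lines that call requests.get(url) and read `.text`. BeautifulSoup accepts the returned byte string directly.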
