scrapy: exceptions.AttributeError: 'unicode' object has no attribute 'dont_filter'


Problem description

In Scrapy, I am getting the error exceptions.AttributeError: 'unicode' object has no attribute 'dont_filter'. After searching around, I found this answer, which made sense because it concerned the only bit of code I had modified before getting the error, and I changed my code accordingly. I changed start_requests to yield the values in the list one at a time instead of returning the whole list, but I'm still getting the error. Any ideas?

def start_requests(self):
    connection = pymongo.Connection(settings['MONGODB_SERVER'],
                                    settings['MONGODB_PORT'])
    db = connection[settings['MONGODB_DB']]
    collection = db[settings['MONGODB_COLLECTION']]
    for el in [i['url'] for i in collection.find({}, {'_id':0, 'url':1})]:
        yield el

I have checked the other parts of the code to confirm that everything else is fine.

Traceback:

[-] Unhandled Error
    Traceback (most recent call last):
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 93, in start
        self.start_reactor()
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/crawler.py", line 130, in start_reactor
        reactor.run(installSignalHandlers=False)  # blocking call
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/twisted/internet/base.py", line 1192, in run
        self.mainLoop()
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/twisted/internet/base.py", line 1201, in mainLoop
        self.runUntilCurrent()
    --- <exception caught here> ---
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/twisted/internet/base.py", line 824, in runUntilCurrent
        call.func(*call.args, **call.kw)
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/utils/reactor.py", line 41, in __call__
        return self._func(*self._a, **self._kw)
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/core/engine.py", line 120, in _next_request
        self.crawl(request, spider)
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/core/engine.py", line 176, in crawl
        self.schedule(request, spider)
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/core/engine.py", line 182, in schedule
        return self.slot.scheduler.enqueue_request(request)
      File "/home/myName/scrapy-test/venv/local/lib/python2.7/site-packages/scrapy/core/scheduler.py", line 48, in enqueue_request
        if not request.dont_filter and self.df.request_seen(request):
    exceptions.AttributeError: 'unicode' object has no attribute 'dont_filter'

Recommended answer

start_requests is supposed to yield individual Request objects, not bare URLs. Each el in your code is a URL string (unicode under Python 2), which is why the scheduler fails when it reads request.dont_filter. Try changing

yield el

to

yield self.make_requests_from_url(el)

(see the question you link to for an example of this)
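The traceback ends at the scheduler reading request.dont_filter, an attribute a plain string does not have. The snippet below is only a minimal sketch of the fix and runs without Scrapy installed. Note that the Request class here is a hypothetical stand-in for Scrapy's own Request, and the docs list is made-up data shaped like the result of the collection.find(...) call in the question.

```python
class Request(object):
    """Hypothetical stand-in for Scrapy's Request class (illustration only)."""

    def __init__(self, url, dont_filter=False):
        self.url = url
        self.dont_filter = dont_filter


def start_requests(documents, request_cls=Request):
    """Yield one request object per stored URL instead of the bare URL string."""
    for doc in documents:
        yield request_cls(doc["url"])


# Made-up documents shaped like collection.find({}, {'_id': 0, 'url': 1})
docs = [{"url": "http://example.com/a"}, {"url": "http://example.com/b"}]
requests = list(start_requests(docs))

# Every yielded item now carries the attribute the scheduler looks up
all_have_attr = all(hasattr(r, "dont_filter") for r in requests)
```

In a real spider the loop body would be yield Request(el), with Request imported from scrapy.http, or yield self.make_requests_from_url(el) as suggested above. As far as I know, make_requests_from_url is deprecated in later Scrapy releases, so building the Request directly is probably the more durable choice.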
