AttributeError: 'Response' object has no attribute 'body_as_unicode' (Scrapy for Python)


Problem description

I am working with Response objects in Scrapy and keep getting this message.

I have included only the snippet where the error occurs. I am trying to crawl several web pages and need to get the number of pages for each one. So I created a Response object and tried to read the href of the Next button from it, but I keep getting AttributeError: 'Response' object has no attribute 'body_as_unicode'.

Here is the code that goes with it:

from scrapy.spiders import Spider
from scrapy.selector import Selector
from scrapy.http import Request
from scrapingtest.items import ScrapingTestingItem
from collections import OrderedDict
import json
from scrapy.selector.lxmlsel import HtmlXPathSelector
import csv
import scrapy
from scrapy.http import Response

class scrapingtestspider(Spider):
    name = "scrapytesting"
    allowed_domains = ["tripadvisor.in"]
 #   base_uri = ["tripadvisor.in"]

    def start_requests(self):
        site_array=["http://www.tripadvisor.in/Hotel_Review-g3581633-d2290190-Reviews-Corbett_Treetop_Riverview-Marchula_Jim_Corbett_National_Park_Uttarakhand.html",
                    "http://www.tripadvisor.in/Hotel_Review-g297600-d8029162-Reviews-Daman_Casa_Tesoro-Daman_Daman_and_Diu.html",
                    "http://www.tripadvisor.in/Hotel_Review-g304557-d2519662-Reviews-Darjeeling_Khushalaya_Sterling_Holidays_Resort-Darjeeling_West_Bengal.html",
                    "http://www.tripadvisor.in/Hotel_Review-g319724-d3795261-Reviews-Dharamshala_The_Sanctuary_A_Sterling_Holidays_Resort-Dharamsala_Himachal_Pradesh.html",
                    "http://www.tripadvisor.in/Hotel_Review-g1544623-d8029274-Reviews-Dindi_By_The_Godavari-Nalgonda_Andhra_Pradesh.html"]

        for i in range(len(site_array)):
            response = Response(url=site_array[i])
            sites = Selector(response).xpath('//a[contains(text(), "Next")]/@href').extract()
 #           sites = response.selector.xpath('//a[contains(text(), "Next")]/@href').extract()
            for site in sites:
                yield Request(site_array[i],self.parse)


Recommended answer

In this case, the line where the error occurs expects a TextResponse object, not a plain Response. Creating a TextResponse instead of a plain Response resolves the error.

The missing method, body_as_unicode, is documented under TextResponse in the Scrapy documentation.

More specifically, use an HtmlResponse, because your response will be HTML rather than plain text. HtmlResponse is a subclass of TextResponse, so it inherits the missing method.

One more thing: where do you set the body of your Response? In the example in your question you set only the URL, not the body. With an empty body, the XPath query has nothing to match, so it returns nothing.
