What is the correct way to work with cookies in Scrapy
Problem description
I'm very much a newbie, and I am working with Scrapy on a site that uses cookies. This is a problem for me, because I can obtain data from a site without cookies, but obtaining the data from a site with cookies is difficult for me. I have this code structure:
class mySpider(BaseSpider):
    name = 'data'
    allowed_domains = []
    start_urls = ["http://...."]

    def parse(self, response):
        sel = HtmlXPathSelector(response)
        items = sel.xpath('//*[@id=..............')
        vlrs = []
        for item in items:
            myItem['img'] = item.xpath('....').extract()
            yield myItem
This works fine; I can obtain the data without cookies using this code structure. I found, at this URL, how one can work with cookies, but I do not understand where I should put that code so that I can then get the data using XPath.
I am testing this code:
request_with_cookies = Request(url="http://...", cookies={'country': 'UY'})
but I don't know how to use it, or where to put this code. I put it into the parse function to obtain the data:
def parse(self, response):
    request_with_cookies = Request(url="http://.....", cookies={'country': 'UY'})
    sel = HtmlXPathSelector(request_with_cookies)
    print request_with_cookies
I tried to use XPath with this new cookie-carrying request, in order to later print the newly scraped data. I thought it would be like working with a URL without cookies, but when I run this I get an error: 'Request' object has no attribute 'body_as_unicode'. What would be the proper way to work with these cookies? I'm a little lost. Thank you very much.
Recommended answer
You are very close!
The contract for the parse() method is that it yields (or returns an iterable of) Items, Requests, or a mix of both. In your case, all you should have to do is
yield request_with_cookies
and your parse() method will be run again with a Response object produced from requesting that URL with those cookies.
http://doc.scrapy.org/en/latest/topics/spiders.html?highlight=parse#scrapy.spider.Spider.parse
http://doc.scrapy.org/en/latest/topics/request-response.html