scrapy authentication login with cookies


Problem description


I am new to scrapy and decided to try it out because of the good online reviews. I am trying to log in to a website with scrapy. I have successfully logged in with a combination of selenium and mechanize by collecting the needed cookies with selenium and adding them to mechanize. Now I am trying to do something similar with scrapy and selenium, but I can't seem to get anything to work. I can't really even tell if anything is working or not. Can anyone please help me? Below is what I've started on. I may not even need to transfer the cookies with scrapy, but I can't tell if the thing ever actually logs in or not. Thanks

from scrapy.spider import BaseSpider
from scrapy.http import Response,FormRequest,Request
from scrapy.selector import HtmlXPathSelector
from selenium import webdriver

class MySpider(BaseSpider):
    name = 'MySpider'
    start_urls = ['http://my_domain.com/']

    def get_cookies(self):
        driver = webdriver.Firefox()
        driver.implicitly_wait(30)
        base_url = "http://www.my_domain.com/"
        driver.get(base_url)
        driver.find_element_by_name("USER").clear()
        driver.find_element_by_name("USER").send_keys("my_username")
        driver.find_element_by_name("PASSWORD").clear()
        driver.find_element_by_name("PASSWORD").send_keys("my_password")
        driver.find_element_by_name("submit").click()
        cookies = driver.get_cookies()
        driver.close()
        return cookies

    def parse(self, response,my_cookies=get_cookies):
        return Request(url="http://my_domain.com/",
            cookies=my_cookies,
            callback=self.login)

    def login(self,response):
        return [FormRequest.from_response(response,
            formname='login_form',
            formdata={'USER': 'my_username', 'PASSWORD': 'my_password'},
            callback=self.after_login)]

    def after_login(self, response):
        hxs = HtmlXPathSelector(response)
        print hxs.select('/html/head/title').extract()

Solution

Your question is more of a debugging issue, so my answer will have just some notes on your question, not an exact answer.

def parse(self, response,my_cookies=get_cookies):
    return Request(url="http://my_domain.com/",
        cookies=my_cookies,
        callback=self.login)

my_cookies=get_cookies - you are assigning the function itself here, not the result it returns. I think you don't need to pass any function here as a parameter at all. It should be:

def parse(self, response):
    return Request(url="http://my_domain.com/",
        cookies=self.get_cookies(),
        callback=self.login)

The cookies argument for Request should be a dict - please verify that it is indeed a dict.
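
For reference, selenium's get_cookies() returns a list of cookie dicts (each with 'name', 'value' and other fields), not a name-to-value mapping, so it may need to be flattened first. A minimal sketch, assuming that list format (cookies_to_dict is a hypothetical helper, not part of either library):

def cookies_to_dict(selenium_cookies):
    # Flatten selenium's list of cookie dicts into the {name: value}
    # dict that scrapy's Request accepts for its cookies argument.
    # (cookies_to_dict is a hypothetical helper name.)
    return dict((c['name'], c['value']) for c in selenium_cookies)

# usage inside parse():
#   cookies=cookies_to_dict(self.get_cookies())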

"I can't really even tell if anything is working or not."

Put some prints in the callbacks to follow the execution.
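
For example, a minimal sketch of what that tracing could look like in the callbacks above (the 'logout' marker is only an illustrative assumption about what a logged-in page might contain; substitute something you know appears only after login):

def login(self, response):
    # Confirm this callback is actually reached and see which page
    # scrapy received after the cookie-carrying request.
    print 'login callback:', response.url
    return [FormRequest.from_response(response,
        formname='login_form',
        formdata={'USER': 'my_username', 'PASSWORD': 'my_password'},
        callback=self.after_login)]

def after_login(self, response):
    print 'after_login callback:', response.url
    # Crude logged-in check: 'logout' is just an assumed marker that
    # only shows up once authenticated.
    print 'logged in?', 'logout' in response.body.lower()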
