Is there a quick way to check whether Scrapy logged in to a website successfully?


Question

I am trying to use Scrapy to log in to GitHub.

# -*- coding: utf-8 -*-
import scrapy

class AutoreplySpider(scrapy.Spider):
    name = 'AutoLogin'
    allowed_domains = ['github.com']
    start_urls = ['https://github.com/login']

    def parse(self, response):
        return scrapy.FormRequest.from_response(
            response,
            formdata={
                'login': 'ac',
                'password': 'pw'
            },
            callback=self.reply
        )

    def after_login(self, response):
        pass

When I logged in to GitHub manually, I checked a box along the lines of "remember username and password", so as long as I don't log out, I should be logged in automatically the next time I visit GitHub. I ran the script in a terminal and it did not report any errors. However, when I visit GitHub, it still asks me to log in. I'm not sure whether my code works, and I haven't touched Scrapy for a while. Is there a quick way to check whether I logged in successfully? Thank you!

Recommended answer

The code is incorrect. Forms often have hidden fields, and the server checks these fields when you send your credentials. I added a loop that collects every input field in the form. When the form data is correct, the account name can be found in the response page. If it is there, you can go ahead.

import scrapy


class AutologinSpider(scrapy.Spider):
    name = 'AutoLogin'
    allowed_domains = ['DOMAIN_TO_LOGIN_COM']
    start_urls = ['URL_OF_FORM_PAGE']
    custom_settings = {'ROBOTSTXT_OBEY': False}

    def parse(self, response):
        # Collect every <input> inside a form, including the hidden ones
        # that the server checks when the credentials are submitted.
        formdata = {}
        for field in response.css('form input'):
            name = field.css('::attr(name)').extract_first()
            value = field.css('::attr(value)').extract_first()
            if name:  # inputs without a name cannot be sent as form data
                formdata[name] = value or ''

        formdata['login'] = 'YOUR_LOGIN'
        formdata['password'] = 'YOUR_PASSWORD'

        return scrapy.FormRequest.from_response(
            response,
            formdata=formdata,
            callback=self.after_login
        )

    def after_login(self, response):
        # If the form was accepted, the account name is in the response page.
        account = response.css('ul.dropdown-menu li strong::text').extract_first()
        if account != 'YOUR_ACCOUNT_NAME':
            # Something went wrong.
            self.logger.error('Login failed')
            return
        # You have successfully logged in. Put your code here.
        self.logger.info('Logged in as %s', account)
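
The check in after_login depends on one CSS selector matching the site's current markup. As a rough alternative, the sketch below tests the raw HTML of the response with the standard library only. It assumes a failed login returns a page that still contains a password field and that a successful one shows the account name somewhere in the page text. The names LoginPageProbe and login_succeeded are hypothetical helpers, not part of Scrapy.

```python
from html.parser import HTMLParser


class LoginPageProbe(HTMLParser):
    """Records whether a password field is present and collects page text."""

    def __init__(self):
        super().__init__()
        self.has_password_field = False
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        # A password <input> usually means we are still on the login form.
        input_type = (dict(attrs).get("type") or "").lower()
        if tag == "input" and input_type == "password":
            self.has_password_field = True

    def handle_data(self, data):
        # Keep all text so the account name can be searched for later.
        self.text_chunks.append(data)


def login_succeeded(html, account_name):
    """Heuristic (assumption): no password field and the account name is shown."""
    probe = LoginPageProbe()
    probe.feed(html)
    page_text = " ".join(probe.text_chunks)
    return not probe.has_password_field and account_name in page_text
```

Inside the spider this could be called as login_succeeded(response.text, 'YOUR_ACCOUNT_NAME') in after_login. To look at the result by eye instead, Scrapy's open_in_browser(response) from scrapy.utils.response opens the page that the spider received. Note that Scrapy keeps its own cookies, separate from your browser's, so opening GitHub in the browser afterwards does not show whether the spider's login worked.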
