Scrapy: Do form login and then work with that session


Problem description

I'm trying to do the following:

  1. Log in to a web page (in my case zendesk.com)
  2. Use that session to do some post requests

In fact zendesk misses some APIs (create/alter macros), which I now need to work around by simulating a browser session.

So I'm not writing a spider but trying to interact with the website as my script proceeds. The post requests are not known from the start, but only arise while the script runs.

In the Scrapy docs, there is the following example illustrating how to use an authenticated session in Scrapy:

# Imports for the (older) Scrapy API this docs example uses
from scrapy.spider import BaseSpider
from scrapy.http import FormRequest
from scrapy import log

class LoginSpider(BaseSpider):
  name = 'example.com'
  start_urls = ['http://www.example.com/users/login.php']

  def parse(self, response):
    return [FormRequest.from_response(response,
                formdata={'username': 'john', 'password': 'secret'},
                callback=self.after_login)]

  def after_login(self, response):
    # check login succeed before going on
    if "authentication failed" in response.body:
        self.log("Login failed", level=log.ERROR)
        return

    # continue scraping with authenticated session...

But it looks like this only works for scraping; in my case I just want to "hold" the session and keep working with it. Is there a way to achieve this with Scrapy, or are there tools that better fit this task?
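To make it concrete, the "holding" I mean can be sketched with the standard library alone, independent of any spider (Python 3 names shown purely for illustration; no real login target here):

```python
import http.cookiejar
import urllib.request

# The "held" session lives in the cookie jar: every request made through
# this opener sends and records cookies, outside any spider lifecycle.
cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

# A login POST through opener.open(...) would populate cj; later calls
# to opener.open(...) then reuse the same authenticated session.
```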

Answer

Thanks a lot @wawaruk. Based on the stackoverflow post you linked, this is the solution I came up with:

import urllib, urllib2, cookielib, re

zendesk_subdomain = 'mysub'
zendesk_username = '...'
zendesk_password = '...'

# Cookie-aware opener: the cookie jar holds the session across requests
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

# Fetch the login page to read the CSRF token from the hidden form field
resp = opener.open('http://%s.zendesk.com/access/unauthenticated' % (zendesk_subdomain))
s = resp.read()
data = dict()
data['authenticity_token'] = re.findall('<input name="authenticity_token" type="hidden" value="([^"]+)"', s)[0]
data['return_to'] = 'http://%s.zendesk.com/login' % zendesk_subdomain
data['user[email]'] = zendesk_username
data['user[password]'] = zendesk_password
data['commit'] = 'Log in'
data['remember_me'] = '1'

# Submit the login form; the session cookie ends up in cj
opener.open('https://%s.zendesk.com/access/login' % zendesk_subdomain, urllib.urlencode(data))
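The token-extraction step can be checked in isolation; a small sketch against a made-up page fragment (the HTML below is hypothetical, the pattern is the one used in the script):

```python
import re

# Hypothetical login-page fragment standing in for the real response body
html = '<form><input name="authenticity_token" type="hidden" value="tok123"></form>'

# Same pattern as in the script: pull the hidden CSRF token out of the form
token = re.findall('<input name="authenticity_token" type="hidden" value="([^"]+)"', html)[0]
print(token)  # tok123
```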

From there, all pages can be accessed with opener, e.g.

opener.open('http://%s.zendesk.com/rules/new?filter=macro' % zendesk_subdomain)
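One side note on the form fields: the Rails-style bracketed names get percent-encoded by urlencode. A quick sketch (shown with Python 3's urllib.parse; the values are placeholders):

```python
from urllib.parse import urlencode  # urllib.urlencode in the Python 2 code above

data = {'user[email]': 'me@example.com', 'commit': 'Log in'}
body = urlencode(data)
# Brackets, '@' and spaces are escaped for the POST body,
# e.g. 'user[email]' becomes 'user%5Bemail%5D'
print(body)
```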
