How to preserve login tokens in automation for websites like Owler?

Question

I am trying to develop a scraper for various sites such as angel.co. I'm stuck designing a crawler for www.owler.com, because it requires logging in through an emailed link whenever we try to access information about a company.

Each time we log in, we get a new login token by email, and it expires after some time. Is there a proper way to preserve the login session in the browser using Selenium with the Python bindings?

I'm just looking for guidelines on handling this type of situation. I have already tried automating the task with Selenium, but it wasn't a fruitful approach.

Recommended answer

I got you man! YES, this can be done via Selenium, but it will take some advanced knowledge of Selenium and a basic understanding of cookies and of how users are authenticated on websites.

Off the top of my head you have the following options:

  • 1. Storing the email-received authentication link & injecting the token inside it into your browser session in the form of a cookie;
  • 2. Storing your session in the form of a Selenium Profile specific to the browser you're running your tests on and loading it afterwards on the instance spawned by your script.

1. (Note: This worked like a charm from the first go so follow closely.)

  • Open www.owler.com in an incognito window (I am using Chrome) and open the cookies section;
  • Spot the cookies you are working with (see this print-screen);
  • Sign in so that you receive the login email. Inspect the sign-in link (see this print-screen);
  • Copy & load the link into another browser (not your incognito session);
  • Once you are logged in, open the browser developer tools (F12, or CTRL+Shift+J on Chrome) > go to the Application tab > click on the Cookies section (for the Owler domain) and copy the value of the OWLER_PC cookie. (see this print-screen for more details)
  • In your anonymous session (not logged in), go to the browser console and add the auth token in the form of a cookie, via the document.cookie property, like this: document.cookie = "OWLER_PC=<yourTokenHere>";
  • Refresh the page 2 times, and VOILA, you are logged in.

Note: I knew that you have to add that cookie as OWLER_PC, because I've inspected the logged-in session and that was the only cookie that was new. The cookie's value (usually) is the same as the authentication token you receive via email.
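
The comparison described in this note can also be done in code. What follows is a minimal sketch, assuming cookie lists in the shape that Selenium's `get_cookies()` returns (a list of dicts with `name` and `value` keys). The helper name `new_cookie_names` and the sample cookie `some_other_cookie` are made up for illustration and are not taken from Owler.

```python
def new_cookie_names(before, after):
    """Return the names of cookies present in `after` but not in `before`."""
    seen = {cookie["name"] for cookie in before}
    return sorted({cookie["name"] for cookie in after} - seen)


# Illustrative data only: real lists would come from driver.get_cookies(),
# captured once before signing in and once after.
anonymous = [{"name": "some_other_cookie", "value": "abc"}]
logged_in = [
    {"name": "some_other_cookie", "value": "abc"},
    {"name": "OWLER_PC", "value": "<yourTokenHere>"},
]
print(new_cookie_names(anonymous, logged_in))
```

Any name that shows up only in the logged-in list is a candidate for the cookie that carries the session.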

Now all that is left to do is simulate this via code. You have to store one of these email authentication tokens in your script (note that they expire after 1 year, so you should be good).
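
Rather than hard-coding the token, one option is to keep it in a small file next to the script and treat it as stale once it gets old. This is only a sketch under assumptions: the file layout, the function names `save_token` and `load_token`, and the use of 365 days as the cut-off are all illustrative, with the cut-off based on the one-year lifetime mentioned above.

```python
import json
import time
from pathlib import Path

# Approximate lifetime, based on the "1 year" figure above (an assumption).
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def save_token(path, token, now=None):
    """Write the token and the time it was saved to a JSON file."""
    saved_at = time.time() if now is None else now
    Path(path).write_text(json.dumps({"token": token, "saved_at": saved_at}))


def load_token(path, now=None, max_age=ONE_YEAR_SECONDS):
    """Return the stored token, or None if it is missing or older than max_age."""
    file = Path(path)
    if not file.exists():
        return None
    record = json.loads(file.read_text())
    current = time.time() if now is None else now
    if current - record["saved_at"] > max_age:
        return None
    return record["token"]
```

When `load_token` returns None, the script would need a fresh token from a new sign-in email.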

Then, once you've opened your session, use the Selenium bindings for the framework/language you are using to add said cookie, then refresh the page. For WebdriverIO/JavaScript (my weapons of choice) it goes something like this:

browser.setCookie({name: 'OWLER_PC', value: 'SPF-yNNJSXeXJ...'});
browser.refresh();
browser.refresh();
// Assert you are logged in 
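
Since the question uses the Python bindings, here is a rough Python equivalent of the snippet above. It is a sketch, not a tested recipe for Owler: the `https://` scheme on the URL and the names `inject_login_cookie` and `run_demo` are assumptions, while `get`, `add_cookie` and `refresh` are standard Selenium WebDriver methods.

```python
OWLER_URL = "https://www.owler.com"  # scheme is an assumption


def inject_login_cookie(driver, token, refreshes=2):
    """Add the OWLER_PC cookie to an open session, then reload the page."""
    # Selenium only accepts cookies for the domain that is currently loaded,
    # so open the site before adding the cookie.
    driver.get(OWLER_URL)
    driver.add_cookie({"name": "OWLER_PC", "value": token})
    for _ in range(refreshes):
        driver.refresh()


def run_demo():
    """Hypothetical usage; needs the selenium package and a Chrome driver."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        inject_login_cookie(driver, "<yourTokenHere>")
        # Assert you are logged in here, for example by checking for an
        # element that only signed-in users see.
    finally:
        driver.quit()
```

The driver is passed in as a parameter so that the same function works with Chrome, Firefox, or any other WebDriver instance.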

2. Sometimes you don't want to add cookies or write boilerplate code just to be logged into a website, or you want a specific set of browser extensions loaded on your Selenium driver instance. In those cases you use browser profiles.

You will have to read up on it yourself, as it is a lengthy topic. This question might also help you, since you are using the Python Selenium bindings.
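
As a starting point for Chrome, a profile can be reused by pointing the browser at an existing user data directory. This is a minimal sketch: the function names are illustrative and the directory you pass in is up to you. `--user-data-dir` and `--profile-directory` are Chrome command-line switches, and `ChromeOptions.add_argument` is part of the Selenium Python bindings.

```python
def profile_arguments(user_data_dir, profile_name=None):
    """Build the Chrome switches that select a stored profile."""
    args = [f"--user-data-dir={user_data_dir}"]
    if profile_name:
        args.append(f"--profile-directory={profile_name}")
    return args


def chrome_with_profile(user_data_dir, profile_name=None):
    """Start Chrome with the given profile; needs selenium and a Chrome driver."""
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in profile_arguments(user_data_dir, profile_name):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

If you sign in once by hand in a browser started this way, later runs that use the same directory should find the session cookies already in place, for as long as the site keeps them valid.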

Hope this helps!
