How can I bypass the Google CAPTCHA with Selenium and Python?


Question

How can I bypass the Google CAPTCHA using Selenium and Python?

When I try to scrape something, Google gives me a CAPTCHA. Can I bypass the Google CAPTCHA with Selenium and Python?

As an example, it's Google reCAPTCHA. You can see this CAPTCHA via this link: https://www.google.com/recaptcha/api2/demo

Recommended answer

To start with: when using Selenium's Python client, you should avoid trying to solve or bypass the Google CAPTCHA.

Selenium automates browsers. What you do with that power is entirely up to you, but it is primarily intended for automating web applications through browser clients for testing purposes, and of course it is not limited to that.

On the other hand, CAPTCHA (an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart") is a type of challenge–response test used in computing to determine whether the user is human.

So Selenium and CAPTCHA serve two completely different purposes and ideally shouldn't be used together for any interrelated task.

Having said that, reCAPTCHA can easily detect the network traffic and identify your program as a Selenium-driven bot.

However, there are some generic approaches to avoid getting detected while web scraping:

  • The first and foremost attribute by which a website can identify your script/program is your monitor size, so it is recommended not to use the conventional viewport.
  • If you need to send multiple requests to a website, keep changing the User Agent on each request. A detailed discussion is available in "Way to change Google Chrome user agent in Selenium?"
  • To simulate human-like behavior, you may need to slow down script execution even beyond WebDriverWait and expected_conditions by inducing time.sleep(secs). A detailed discussion is available in "How to sleep Selenium WebDriver in Python for milliseconds".
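The three points above could be sketched roughly as follows. This is only an illustration, not the answer author's code: the user-agent strings, the window sizes, and the helper names (chrome_arguments, pause, make_driver) are all assumptions made up for the example. The --window-size and --user-agent switches are standard Chrome command-line arguments, and Selenium is imported lazily so the helper functions can be tried without a browser installed.

```python
import random
import time

# Hypothetical pool of user-agent strings (assumption: substitute your own).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]

# Hypothetical window sizes to choose from (assumption: any realistic sizes work).
WINDOW_SIZES = [(1366, 768), (1440, 900), (1536, 864)]


def chrome_arguments():
    """Return Chrome arguments with a randomly chosen window size and user agent."""
    width, height = random.choice(WINDOW_SIZES)
    user_agent = random.choice(USER_AGENTS)
    return [f"--window-size={width},{height}", f"--user-agent={user_agent}"]


def pause(min_secs=1.0, max_secs=3.0, sleep=time.sleep):
    """Sleep for a random duration between min_secs and max_secs; return it."""
    delay = random.uniform(min_secs, max_secs)
    sleep(delay)
    return delay


def make_driver():
    """Create a Chrome driver using the arguments above (needs Selenium and Chrome)."""
    from selenium import webdriver  # lazy import: helpers above work without it

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

A possible use, assuming Selenium and Chrome are installed: call driver = make_driver(), then driver.get(url), call pause() between page interactions, and finish with driver.quit(). Creating a fresh driver per session is what changes the user agent here; none of this is guaranteed to prevent detection.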

However, in a couple of use cases we were able to interact with the reCAPTCHA using Selenium, and you can find more details in the following discussions:

You can find a couple of related discussions in:
