How can I bypass the Google CAPTCHA with Selenium and Python?

Question

How can I bypass the Google CAPTCHA using Selenium and Python?

When I try to scrape something, Google gives me a CAPTCHA. Can I bypass the Google CAPTCHA with Selenium and Python?

As an example, it's Google reCAPTCHA. You can see this CAPTCHA via this link: https://www.google.com/recaptcha/api2/demo

Recommended answer

To start with, when using Selenium's Python client you should avoid trying to solve or bypass Google CAPTCHA.

Selenium automates browsers. What you do with that power is entirely up to you, but it is primarily meant for automating web applications through browser clients for testing purposes, and of course it is not limited to that.

On the other hand, CAPTCHA (an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart") is a type of challenge–response test used in computing to determine whether the user is human.

So, Selenium and CAPTCHA serve two completely different purposes and ideally shouldn't be used to achieve any interrelated tasks.

Having said that, reCAPTCHA can easily detect the network traffic and identify your program as a Selenium-driven bot.

However, there are some generic approaches to avoid getting detected while web scraping:

  • The first and foremost attribute a website can use to identify your script/program is your monitor size, so it is recommended not to use the conventional viewport.
  • If you need to send multiple requests to a website, keep changing the User Agent on each request. A detailed discussion is available in "Way to change Google Chrome user agent in Selenium?"
  • To simulate humanlike behavior, you may need to slow down the script execution even beyond WebDriverWait and expected_conditions by adding time.sleep(secs). A detailed discussion is available in "How to sleep Selenium WebDriver in Python for milliseconds".
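
The three points above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the `VIEWPORTS` and `USER_AGENTS` pools and the helper names (`pick_viewport`, `pick_user_agent`, `human_pause`, `build_driver`) are hypothetical, and the sketch assumes Chrome with Selenium 4 installed. It sets the user agent and window size through Chrome's `--user-agent` and `--window-size` command-line switches and adds a randomized `time.sleep` pause.

```python
import random
import time

# Hypothetical pools for illustration only; they do not come from the original answer.
VIEWPORTS = [(1366, 768), (1440, 900), (1536, 864), (1600, 900)]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def pick_viewport(rng=random):
    """Return a (width, height) pair chosen at random from VIEWPORTS."""
    return rng.choice(VIEWPORTS)


def pick_user_agent(rng=random):
    """Return a user agent string chosen at random from USER_AGENTS."""
    return rng.choice(USER_AGENTS)


def human_pause(min_s=0.5, max_s=2.0, sleep=time.sleep, rng=random):
    """Sleep for a random duration between min_s and max_s seconds.

    Returns the delay that was used so callers can log it.
    """
    delay = rng.uniform(min_s, max_s)
    sleep(delay)
    return delay


def build_driver():
    """Start Chrome with a randomly chosen window size and user agent."""
    # Imported here so the helpers above work even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    width, height = pick_viewport()
    options = Options()
    options.add_argument(f"--user-agent={pick_user_agent()}")
    options.add_argument(f"--window-size={width},{height}")
    return webdriver.Chrome(options=options)


if __name__ == "__main__":
    driver = build_driver()
    try:
        driver.get("https://www.google.com/recaptcha/api2/demo")
        human_pause()  # pause like a person would before doing anything else
        print(driver.title)
    finally:
        driver.quit()
```

Note that this sketch fixes the user agent once per browser launch, so rotating it "on each request" would in this form mean starting a new driver for each session.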

However, in a couple of use cases we were able to interact with the reCAPTCHA using Selenium and you can find more details in the following discussions:

You can find a couple of related discussions in:
