How can I make a Selenium script undetectable using GeckoDriver and Firefox through Python?

Views: 56 Published: 2021/12/8 15:54:27 Tags: python selenium firefox geckodriver selenium-firefoxdriver


Problem description


Is there a way to make your Selenium script undetectable in Python using geckodriver?


I'm using Selenium for scraping. Are there any protections we need to use so websites can't detect Selenium?

Solution

The fact that Selenium-driven Firefox / GeckoDriver gets detected doesn't depend on any specific GeckoDriver or Firefox version. The websites themselves can inspect the network traffic and identify the browser client, i.e. recognize that the web browser is controlled by WebDriver.

As per the documentation of the WebDriver interface in the latest editor's draft of WebDriver (W3C Living Document), the webdriver-active flag, which is initially set to false, is set to true when the user agent is under remote control, i.e. when it is controlled through Selenium.



Note that the NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.

So,

webdriver
    Returns true if the webdriver-active flag is set, false otherwise.

whereas

navigator.webdriver
    Defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, for example so that alternate code paths can be triggered during automation.

So, the bottom line is:

  Selenium identifies itself
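
To see this self-identification from the automation side, the flag can be read back from the page. The following is a minimal sketch, not a definitive check: the helper names `is_flagged_as_automated` and `check_firefox` and the URL are hypothetical, and it assumes Selenium is installed and geckodriver is available to it.

```python
def is_flagged_as_automated(driver):
    """Return True if the current page sees navigator.webdriver as true."""
    # execute_script runs JavaScript in the page and returns its result.
    return bool(driver.execute_script("return navigator.webdriver"))


def check_firefox(url="https://example.com"):
    """Open `url` (placeholder) in Selenium-driven Firefox and report the flag."""
    # Imported lazily so the helper above can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumes geckodriver can be located
    try:
        driver.get(url)
        return is_flagged_as_automated(driver)
    finally:
        driver.quit()
```

Calling `check_firefox()` starts a real browser session; any page script can read the same property, which is why the flag alone is enough for a site to notice automation.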




However, some generic approaches to avoid getting detected while web scraping are as follows:


The first attribute a website can use to identify your script/program is your monitor size, so it is recommended not to use the conventional viewport.
If you need to send multiple requests to a website, keep changing the User-Agent on each request. A detailed discussion is available in "Way to change Google Chrome user agent in Selenium?"
To simulate human-like behavior, you may need to slow down script execution even beyond WebDriverWait and expected_conditions by inducing time.sleep(secs). A detailed discussion is available in "How to sleep webdriver in python for milliseconds".
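
The three suggestions above can be sketched together for Firefox as follows. This is an illustration under stated assumptions rather than a recipe that guarantees anything: the window sizes and User-Agent strings are made-up sample pools, the helper names are hypothetical, randomizing the window size is only one reading of "not the conventional viewport", and the Firefox preference `general.useragent.override` changes the User-Agent per driver session, not per individual request.

```python
import random
import time

# Illustrative pools (assumptions); replace with values that suit your targets.
WINDOW_SIZES = [(1366, 768), (1440, 900), (1536, 864), (1920, 1080)]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0",
]


def pick_profile(rng=random):
    """Choose a (width, height) window size and a User-Agent string at random."""
    return rng.choice(WINDOW_SIZES), rng.choice(USER_AGENTS)


def human_pause(low=1.0, high=3.0, sleep=time.sleep, rng=random):
    """Sleep for a random duration between low and high seconds; return it."""
    delay = rng.uniform(low, high)
    sleep(delay)
    return delay


def build_driver():
    """Start Firefox with a randomly chosen window size and User-Agent."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    (width, height), user_agent = pick_profile()
    options = Options()
    # Firefox preference that overrides the User-Agent for this session.
    options.set_preference("general.useragent.override", user_agent)
    driver = webdriver.Firefox(options=options)
    driver.set_window_size(width, height)
    return driver
```

A script would call `build_driver()` once per session and `human_pause()` between page actions. None of this alters `navigator.webdriver`, so a site that checks that property can still recognize the session as automated.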

