Evade detection of Selenium automation

Question

To test my skills, I am writing a Python program that should go to the web page https://www.solebox.com/de_DE, select a product, and save its name, tag and price in a .txt file using the Selenium library (in the future I may turn it into a shoe bot). The problem is that the site detects that I am using automated software and does not allow me to access the products. I've already tried the undetected_chromedriver library, but it didn't work. Does anyone know a working method? Thank you.

More info: OS: Windows 10, Chrome version: 88.0.4324.150 (64-bit), Python version: 3.9.1, Editor: Visual Studio Code

Recommended answer

There are multiple ways to evade detection of Selenium automation.

Using the --disable-blink-features=AutomationControlled argument

Code Block:

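from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Disable the Blink AutomationControlled feature, which is what sets navigator.webdriver
options.add_argument('--disable-blink-features=AutomationControlled')

# executable_path is the Selenium 3 form; Selenium 4 expects a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get('https://www.solebox.com/de_DE')
print(driver.page_source)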

Console Output:

<!-- =============== This snippet of JavaScript handles fetching the dynamic recommendations from the remote recommendations server
and then makes a call to render the configured template with the returned recommended products: ================= -->

<script>
(function(){
// window.CQuotient is provided on the page by the Analytics code:
var cq = window.CQuotient;
if (cq && ('function' == typeof cq.getCQUserId)
&& ('function' == typeof cq.getCQCookieId)
&& ('function' == typeof cq.getCQHashedEmail)
&& ('function' == typeof cq.getCQHashedLogin)) {
var recommender = '[[&quot;Homepage_Topseller&quot;]]';
// cleaning up the leading/trailing brackets and quotes:
recommender=recommender.slice(8, recommender.length-8);
var separator = '|||';
.
</script>
<script type="text/javascript">//<!--
/* <![CDATA[ (viewProduct-active_data.js) */
dw.ac._capture({id: "01900289", type: "recommendation"});
/* ]]> */
// -->
</script>
.
<script type="text/javascript" id="" src="//static.criteo.net/js/ld/ld.js"></script>
<script type="text/javascript" id="">window.criteo_q=window.criteo_q||[];window.criteo_q.push({event:"setAccount",account:google_tag_manager["GTM-M9TMD24"].macro(24)},{event:"setEmail",email:""},{event:"setSiteType",type:"d"},{event:"viewHome"});</script><div id="criteo-tags-div" style="display: none;"><iframe src="https://gum.criteo.com/syncframe?topUrl=www.solebox.com#{&quot;bundle&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;cw&quot;:true,&quot;lwid&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;optout&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;origin&quot;:&quot;onetag&quot;,&quot;pm&quot;:0,&quot;sid&quot;:{&quot;origin&quot;:0,&quot;value&quot;:null},&quot;tld&quot;:&quot;solebox.com&quot;,&quot;topUrl&quot;:&quot;www.solebox.com&quot;,&quot;uid&quot;:null,&quot;version&quot;:&quot;5_6_2&quot;}" id="criteo-syncframe" width="0" height="0" frameborder="0" style="border-width:0px; margin:0px; display:none" title="Criteo GUM iframe"></iframe></div></body></html>

You can find a relevant detailed discussion in Selenium can't open a second page
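To check whether the argument above had any effect, one option is to read navigator.webdriver from the loaded page. The sketch below is only an illustration: automation_flag_visible and FakeDriver are hypothetical names that do not appear in the original answer, and the helper relies solely on the standard WebDriver execute_script method. The stand-in class lets it run without launching Chrome; with a real session you would pass the driver object from the code block above. Sites may also use other signals, so a falsy value here does not guarantee that the page will load.

```python
def automation_flag_visible(driver):
    """Return True if the page sees a truthy navigator.webdriver value."""
    # execute_script runs JavaScript in the current page and returns the result
    return bool(driver.execute_script("return navigator.webdriver"))


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, used only to try the helper offline."""

    def __init__(self, flag):
        self.flag = flag

    def execute_script(self, script):
        # A real driver would evaluate the script in the browser;
        # this stub just returns the preset value.
        return self.flag


# With a real session: automation_flag_visible(driver)
print(automation_flag_visible(FakeDriver(True)))
print(automation_flag_visible(FakeDriver(None)))
```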


Using undetected_chromedriver

Code Block:

import undetected_chromedriver as uc
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# uc.Chrome starts Chrome through a patched chromedriver binary
driver = uc.Chrome(options=options)
driver.get("https://www.solebox.com/de_DE")
print(driver.page_source)

Console Output:

.
.
<script type="text/javascript" id="">!function(b,e,f,g,a,c,d){b.fbq||(a=b.fbq=function(){a.callMethod?a.callMethod.apply(a,arguments):a.queue.push(arguments)},b._fbq||(b._fbq=a),a.push=a,a.loaded=!0,a.version="2.0",a.queue=[],c=e.createElement(f),c.async=!0,c.src=g,d=e.getElementsByTagName(f)[0],d.parentNode.insertBefore(c,d))}(window,document,"script","https://connect.facebook.net/en_US/fbevents.js");fbq("init",google_tag_manager["GTM-M9TMD24"].macro(19));fbq("track","PageView");</script>
<noscript><img height="1" width="1" style="display:none" src="https://www.facebook.com/tr?id=238536633211197&amp;ev=PageView&amp;noscript=1"></noscript>

<script type="text/javascript" id="" src="//static.criteo.net/js/ld/ld.js"></script></body></html>

You can find a relevant detailed discussion in Undetected Chromedriver not loading correctly
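Once the page loads, the question's remaining step is to save a product's name, tag and price in a .txt file. A minimal sketch of that step follows, assuming the three values have already been extracted as strings. The function names, the tab-separated layout, the products.txt file name and the sample values are all assumptions made for illustration; none of them come from the site or from the original answer.

```python
def format_product_line(name, tag, price):
    """Build one tab-separated line for a product (layout is an assumption)."""
    return f"{name}\t{tag}\t{price}\n"


def save_product(path, name, tag, price):
    """Append one product line to a text file and return the line written."""
    line = format_product_line(name, tag, price)
    # Append mode keeps earlier products; UTF-8 preserves symbols such as the euro sign
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line)
    return line


if __name__ == "__main__":
    # Hypothetical sample values, not real data from the site
    save_product("products.txt", "Example Sneaker", "Example Brand", "129,99 €")
```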
