Is it possible for WebDriver to click an element with a mouseclick event that invokes a JavaScript file that includes a test tracker?


Question


This question arises from another user's question; my answer there gives some context for this one. The web page I am working with is https://hotels.ctrip.com/hotel/347422.html?isFull=F#ctm_ref=hod_sr_lst_dl_n_1_8 if you want to check it out for yourself.

Consider the Python Selenium script at the bottom of the question. Nothing happens when Selenium tries to click on this element:

browser.find_element_by_xpath('//*[@id="cPageBtn"]').click()

The same thing happens with this element:

browser.find_element_by_xpath('//*[@id="appd_wrap_close"]').click()

When debugging my Selenium script for each element, I confirmed that Selenium can find the element just fine; it is not hidden, inside an iframe, disabled, or affected by any other oddity that I normally check for when a Selenium action fails.

However, it has a mouseclick event that invokes an interesting JavaScript file, and I was actually able to access the file by navigating to its URL.

Hovering over the URL revealed that it is https://webresource.c-ctrip.com/ResUnionOnline/R3/float/floating_normal.min.js?20190306:2.

At the very beginning of the file I found:

document.getElementById("ab_testing_tracker") && "abTestValue_Value" != h ? 
document.getElementById("ab_testing_tracker").value

So I searched the web page HTML (via CSS selector in the dev console) for an element with an id of "ab_testing_tracker", and I was less than surprised that it returned nothing. Then I unminified the file and searched it for all instances of "ab_testing_tracker". That led me to this statement:

document.getElementsByTagName("body")[0].insertAdjacentHTML("afterBegin","<input type='hidden' name='ab_testing_tracker' id='ab_testing_tracker' value='"+h.split("|")[1]+"'>")

Well, it appears that a hidden input node is inserted into the body of the document for the purpose of automation tracking. Searching Google revealed that automation detection is often done by inspecting the navigator.userAgent property for user agents that indicate automation. But the script uses a random legitimate user agent every time, so I don't think the user agent is how the site is detecting Selenium.
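
To make the inserted markup concrete, the sketch below rebuilds the string that the insertAdjacentHTML call produces and parses it with Python's standard-library HTML parser. The value of h is hypothetical (the real one is not shown in the question); only its "key|value" shape is inferred from the h.split("|")[1] expression. InputCollector and tracker_markup are illustrative names, not part of the site's code.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))


def tracker_markup(h):
    """Mirror the JavaScript string concatenation from the minified file."""
    return (
        "<input type='hidden' name='ab_testing_tracker' "
        "id='ab_testing_tracker' value='" + h.split("|")[1] + "'>"
    )


# Hypothetical value of h; only the "key|value" shape comes from the script.
h = "abTestValue_Key|B"

collector = InputCollector()
collector.feed(tracker_markup(h))
tracker = collector.inputs[0]
print(tracker)
```

An input of type "hidden" is never rendered, so an element like this cannot sit on top of a button and swallow a click. The name also suggests A/B test bookkeeping rather than automation detection, although that reading is an assumption.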

Summary and possible workarounds

Selenium can't click certain elements on the web page, likely because of test tracking by the website. I thought of a couple of ways to get around it. Maybe I can disable click events when using Selenium? I don't know how to do this and couldn't find a way after searching online. Next, I tried clicking the element with a JavaScript executor, but that didn't work.
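
For reference, a JavaScript-executor click usually looks something like the sketch below. The helper name js_click and the FakeDriver class are hypothetical; FakeDriver only stands in for a real WebDriver so that the snippet runs without a browser. The one real API call assumed here is WebDriver's execute_script(script, *args), which exposes the extra arguments to the script as arguments[0], arguments[1], and so on.

```python
def js_click(driver, element):
    """Click an element by calling its DOM click() method through the driver."""
    # Extra arguments to execute_script are available in the script as arguments[i].
    driver.execute_script("arguments[0].click();", element)


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def execute_script(self, script, *args):
        self.calls.append((script, args))


fake = FakeDriver()
js_click(fake, "cPageBtn-element")  # a real call would pass a WebElement
print(fake.calls)
```

With a live session the call would be js_click(browser, browser.find_element_by_id("cPageBtn")). This bypasses Selenium's own visibility and overlap checks, but as noted above it made no difference on this page.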

Does anyone know a way to get around the test tracker and click the desired element?

import random

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By

# url
url = "https://hotels.ctrip.com/hotel/347422.html?isFull=F#ctm_ref=hod_sr_lst_dl_n_1_8"

# User Agent
User_Agent_List = ["Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2",
                   "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)",
                   "Mozilla/5.0 (compatible; MSIE 10.0; Macintosh; Intel Mac OS X 10_7_3; Trident/6.0)",
                   "Opera/9.80 (X11; Linux i686; U; ru) Presto/2.8.131 Version/11.11",
                   "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2",
                   "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/16.0.1",
                   "Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5355d Safari/8536.25"]

# Define the related lists
Score = []
Travel_Types = []
Room_Types = []
Travel_Dates = []
Comments = []

DEFINE_PAGE = 10

def next_page():

    current_page = int(browser.find_element_by_css_selector('a.current').text)

    # First, clear the input box
    browser.find_element_by_id("cPageNum").clear()
    print('Clear the input page')

    # Second, input the next page
    nextPage = current_page + 1
    print('Next page ',nextPage)
    browser.find_element_by_id("cPageNum").send_keys(nextPage)

    # Third, press the goto button
    browser.find_element_by_xpath('//*[@id="cPageBtn"]').click()



def scrap_comments():
    """
    Scrape user comments, scores, room types, and travel dates from the current page.
    """
    html = browser.page_source
    soup = BeautifulSoup(html, "lxml")
    scores_total = soup.find_all('span', attrs={"class":"n"})
    # We only want [0], [2], [4], ...
    travel_types = soup.find_all('span', attrs={"class":"type"})
    room_types = soup.find_all('a', attrs={"class":"room J_baseroom_link room_link"})
    travel_dates = soup.find_all('span', attrs={"class":"date"})
    comments = soup.find_all('div', attrs={"class":"J_commentDetail"})
    # Save score in the Score list
    for i in range(2,len(scores_total),2):
        Score.append(scores_total[i].string)
    # extend() adds each item; append() would store the generator object itself
    Travel_Types.extend(item.text for item in travel_types)
    Room_Types.extend(item.text for item in room_types)
    Travel_Dates.extend(item.text for item in travel_dates)
    Comments.extend(item.text.replace('\n','') for item in comments)

if __name__ == '__main__':

    # Randomly choose a user agent
    user_agent = random.choice(User_Agent_List)
    print('User-Agent: ', user_agent)

    # Browser options setting
    options = Options()
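    # Firefox takes the user agent as a preference, not a command-line argument
    options.set_preference("general.useragent.override", user_agent)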
    options.add_argument("disable-infobars")

    # Open a Firefox browser
    browser = webdriver.Firefox(options=options)
    browser.get(url)


    browser.find_element_by_xpath('//*[@id="appd_wrap_close"]').click()

    page = 1
    while page <= DEFINE_PAGE:
        scrap_comments()
        next_page()
        page += 1  # advance the counter so the loop ends after DEFINE_PAGE pages

    browser.close()

Solution

The problem is not the tracking or the click event; it is timing and possibly browser size. Maximize the browser window and add an explicit wait when looking for the banner close button:

browser = webdriver.Firefox(options=options)
browser.maximize_window()
browser.get(url)

wait = WebDriverWait(browser, 10)

wait.until(EC.element_to_be_clickable((By.ID, 'appd_wrap_close'))).click()
wait.until(EC.invisibility_of_element_located((By.ID, 'appd_wrap_default')))

current_page = int(browser.find_element(By.CSS_SELECTOR, 'a.current').text)
next_page = current_page + 1

page_number_field = wait.until(EC.visibility_of_element_located((By.ID, 'cPageNum')))
page_number_field.clear()
page_number_field.send_keys(next_page)
wait.until(EC.element_to_be_clickable((By.ID, 'cPageBtn'))).click()
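
The explicit waits above come down to polling: the condition is checked repeatedly until it returns something truthy or the timeout expires (WebDriverWait polls every 0.5 seconds by default and raises TimeoutException on failure). The sketch below is a simplified pure-Python illustration of that loop, not Selenium's actual implementation; wait_until and banner_close_button_ready are hypothetical names, and the fake condition simply pretends the element becomes clickable on the third check.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it is truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll)


# Hypothetical condition: the "element" only becomes clickable on the third check.
attempts = {"count": 0}


def banner_close_button_ready():
    attempts["count"] += 1
    return "appd_wrap_close" if attempts["count"] >= 3 else None


found = wait_until(banner_close_button_ready, timeout=2.0, poll=0.01)
print(found, attempts["count"])
```

A bare find_element(...).click() corresponds to checking the condition once, immediately after the page load returns. If the banner or the paging button is not yet interactive at that moment, the click has no effect, which matches the behavior described in the question.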
