How do I click on the "More" button when web scraping Tripadvisor using Selenium?

Problem description

I'm trying to scrape a page of written reviews on Tripadvisor, but I'm having trouble clicking the "More" button that expands all the written reviews on the page. I've looked at similar questions (thank you, Saurabh Gaur), but when the button is clicked using Selenium, a login page pops up.

[Image: screenshot of the login page]

Is there a way to click on the "More" button without triggering this? Thank you! :)

from selenium import webdriver
import re
from bs4 import BeautifulSoup

def clicker(url):
    browser = webdriver.Firefox()
    browser.get(url)


    # Use regex to find that button link
    pageSource = browser.page_source
    soup = BeautifulSoup(pageSource, 'html.parser')

    # Example: soup.findAll(True, {'class': re.compile(r'\bclass1\b')})
    Regex = re.compile(r'.*\bmoreLink.ulBlueLinks.*')
    # Join the matched span's class list into a CSS selector, e.g. span.a.b
    linkElem = soup.find('span', class_=Regex)['class']
    moreButton = 'span.' + '.'.join(linkElem)

    print(moreButton)

    button = browser.find_element_by_css_selector(moreButton)
    print(button)

    browser.execute_script("arguments[0].click()", button) 

clicker('https://www.tripadvisor.com.sg/Hotel_Review-g295424-d1209362-Reviews-Residence_Spa_at_One_Only_Royal_Mirage_Dubai-Dubai_Emirate_of_Dubai.html')


Recommended answer

Here is some sample code for your reference. You can use Selenium with PhantomJS to click the button. I located the element by the name attribute of its tag, which is what the function "find_element_by_name" requires; change the name to suit your page.

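from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import (NoSuchElementException,
                                        WebDriverException)


def openUrl(link):
    # webdriver.PhantomJS and find_element_by_name need an older Selenium (3.x)
    driver = webdriver.PhantomJS(
        executable_path='../../phantomjs/bin/phantomjs')
    try:
        driver.get(link)
    except WebDriverException:
        print('Error opening ' + link)
        driver.quit()
        return None

    try:
        # Locate the button by its name attribute and click it
        elem1 = driver.find_element_by_name('checkAndShowAnswers')
        elem1.click()
    except NoSuchElementException:
        pass  # no such button on this page; parse the page as it is

    # Parse after the click so the expanded content is included
    bsObj = BeautifulSoup(driver.page_source, 'html.parser')
    driver.quit()
    return bsObj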
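
Selenium 4.3 and later no longer provide the find_element_by_* helpers, so here is a minimal sketch of the same click written against find_elements instead. The helper names (css_from_classes, click_all, expand_reviews) are hypothetical. The class names moreLink and ulBlueLinks are taken from the regex in the question, and it is an assumption that both sit on the same span. This only shows how to locate and click the links; it is not known to prevent the login prompt described in the question.

```python
def css_from_classes(tag, classes):
    """Build a selector such as 'span.moreLink.ulBlueLinks' from a class list."""
    return tag + "".join("." + name for name in classes)


def click_all(driver, selector):
    """JavaScript-click every element matching `selector`; return the count."""
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    elements = driver.find_elements("css selector", selector)
    for element in elements:
        driver.execute_script("arguments[0].click();", element)
    return len(elements)


def expand_reviews(url):
    """Hypothetical wrapper: open `url` in Firefox and click the "More" links."""
    # Imported lazily so the helpers above can be used without a browser.
    from selenium import webdriver  # requires selenium and geckodriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Assumed class names, copied from the regex in the question.
        selector = css_from_classes("span", ["moreLink", "ulBlueLinks"])
        clicked = click_all(driver, selector)
        return clicked, driver.page_source
    finally:
        driver.quit()
```

Calling expand_reviews with the hotel URL from the question would return the number of links clicked and the page source after the clicks, which can then be passed to BeautifulSoup as in the code above.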
