Python Webscraping Selenium and BeautifulSoup (Modal window content)


Problem Description


I am trying to learn web scraping (I am a total novice). I noticed that on some websites (e.g. Quora), when I click a button, a new element comes up on screen, and I cannot seem to get the page source of that new element. I want to be able to get the page source of the new popup and all of its elements. Note that you need a Quora account to reproduce my problem.

Here is part of the code, using BeautifulSoup, Selenium, and chromedriver:

from selenium import webdriver
from selenium.webdriver.common.by import By  # needed for the By.XPATH lookups below
from bs4 import BeautifulSoup
from unidecode import unidecode
import time

sleep = 10
USER_NAME = 'Insert Account name' #Insert Account name here
PASS_WORD = 'Insert Account Password' #Insert Account Password here
url = 'Insert url' 
url2 = ['insert url']
#Logging in to your account
driver = webdriver.Chrome('INSERT PATH TO CHROME DRIVER')
driver.get(url)
page_source = driver.page_source
if 'Continue With Email' in page_source:
    try:
        username = driver.find_element(By.XPATH, '//input[@placeholder="Email"]')
        password = driver.find_element(By.XPATH, '//input[@placeholder="Password"]')
        login= driver.find_element(By.XPATH, '//input[@value="Login"]')
        username.send_keys(USER_NAME)
        password.send_keys(PASS_WORD)
        time.sleep(sleep)
        login.click()
        time.sleep(sleep)
    except Exception:
        print ('Did not work :( .. Try again')
else:
    print ('Did not work :( .. Try different page')


The next part goes to the relevant webpage and ("tries to") collect information about the followers of a particular question.

for url1 in url2:
    driver.get(url1)
    source = driver.page_source
    soup1 = BeautifulSoup(source, "lxml")
    Follower_button = soup1.find('a', {'class': 'FollowerListModalLink QuestionFollowerListModalLink'})
    Follower_button2 = unidecode(Follower_button.text)
    driver.find_element(By.LINK_TEXT, Follower_button2).click()

    #### Does not give me the correct page source on the next line ####
    source2 = driver.page_source
    soup2 = BeautifulSoup(source2, "lxml")

    follower_list = soup2.findAll('div', {'class': 'FollowerListModal QuestionFollowerListModal Modal'})
    if len(follower_list) > 0:
        print('It worked :)')
    else:
        print('Did not work :(')

However, when I try to get the page source of the followers element, I end up getting the page source of the main page rather than the follower element. Can anyone help me get the page source of the follower element that pops up? What am I not getting here?

NOTE: Another way of recreating or looking at my problem is to log in to your Quora account (if you have one) and then go to any question with followers. If you click the followers button on the lower right side of the screen, a popup will appear. My problem is essentially to get the elements of this popup.


Update - Okay, so I have been reading a bit, and it seems the window is a modal window. Can anyone help me with getting the contents of a modal window?

Recommended Answer

Problem resolved. All I had to do was add one line:

time.sleep(sleep)

after generating the click. The problem was that, with no wait time initially, the page source was not getting updated. With a sufficiently long time.sleep (how long may vary from website to website), the page source finally got updated and I was able to get the required elements. :) Lesson learned: patience is the key to web scraping. I spent the entire day trying to figure this out.
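
As an aside, the same idea can be made more robust with Selenium's explicit waits, which poll for a condition instead of pausing for a fixed duration. Below is a minimal sketch, assuming the modal container still carries the FollowerListModal class used in the question's code (Quora's markup may have changed since):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Click the followers link, then block (up to 10 seconds) until the modal's
# container div is actually present in the DOM before reading the page source.
driver.find_element(By.LINK_TEXT, Follower_button2).click()
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, 'div.FollowerListModal'))  # assumed selector
)
source2 = driver.page_source  # now includes the rendered modal content

This waits only as long as needed and raises a TimeoutException if the modal never appears, which is easier to diagnose than a silently too-short sleep.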
