Python Webscraping Selenium and BeautifulSoup (Modal window content)


Problem description


I am trying to learn web scraping (I am a total novice). I noticed that on some websites (e.g. Quora), when I click a button, a new element comes up on screen, but I cannot seem to get the page source of that new element. I want to be able to get the page source of the new popup and all of its elements. Note that you need to have a Quora account in order to reproduce my problem.

Here is part of the code, which uses BeautifulSoup, Selenium and chromedriver:

from selenium import webdriver
from selenium.webdriver.common.by import By  # needed for find_element(By.XPATH, ...)
from bs4 import BeautifulSoup
from unidecode import unidecode
import time

sleep = 10
USER_NAME = 'Insert Account name'  # Insert Account name here
PASS_WORD = 'Insert Account Password'  # Insert Account Password here
url = 'Insert url'
url2 = ['insert url']

# Log in to your account
driver = webdriver.Chrome('INSERT PATH TO CHROME DRIVER')
driver.get(url)
page_source = driver.page_source
if 'Continue With Email' in page_source:
    try:
        username = driver.find_element(By.XPATH, '//input[@placeholder="Email"]')
        password = driver.find_element(By.XPATH, '//input[@placeholder="Password"]')
        login = driver.find_element(By.XPATH, '//input[@value="Login"]')
        username.send_keys(USER_NAME)
        password.send_keys(PASS_WORD)
        time.sleep(sleep)
        login.click()
        time.sleep(sleep)
    except Exception:
        print('Did not work :( .. Try again')
else:
    print('Did not work :( .. Try different page')




The next part goes to the webpage in question and ("tries to") collect information about the followers of a particular question.

for url1 in url2:
    driver.get(url1)
    source = driver.page_source
    soup1 = BeautifulSoup(source, "lxml")
    follower_button = soup1.find('a', {'class': 'FollowerListModalLink QuestionFollowerListModalLink'})
    follower_button_text = unidecode(follower_button.text)
    driver.find_element(By.LINK_TEXT, follower_button_text).click()

    #### Does not give me the correct page source in the next line ####
    source2 = driver.page_source
    soup2 = BeautifulSoup(source2, "lxml")

    follower_list = soup2.findAll('div', {'class': 'FollowerListModal QuestionFollowerListModal Modal'})
    if len(follower_list) > 0:
        print('It worked :)')
    else:
        print('Did not work :(')


However, when I try to get the page source of the followers element, I end up getting the page source of the main page rather than the follower element. Can anyone help me get the page source of the follower element that pops up? What am I not getting here?


NOTE: Another way of recreating or looking at my problem is to log in to your Quora account (if you have one) and then go to any question with followers. If you click the followers button on the lower right side of the screen, that will result in a popup. My problem is essentially to get the elements of this popup.



Update - Okay, so I have been reading a bit, and it seems the window is a modal window. Can anyone help me get the contents of a modal window?

Answer


Problem resolved. All I had to do was to add one line:

time.sleep(sleep)


after generating the click. The problem was that, with no wait time initially, the page source was not getting updated. With a sufficiently long time.sleep (the required duration may vary from website to website), the page source finally got updated and I was able to get the required elements. :) Lesson learned: patience is the key to web scraping. I spent the entire day trying to figure this out.
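A fixed sleep works, but it either wastes time or can still race the page on a slow connection. A more robust pattern is to poll until a condition holds, which is what Selenium's `WebDriverWait(...).until(...)` with `expected_conditions` implements. As a minimal sketch of that idea in plain Python (the `condition` callback stands in for a check like "the modal element is present in the page source"; names and defaults here are illustrative, not from the original answer):

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Returns the truthy value from `condition`, or raises TimeoutError --
    the same contract as Selenium's WebDriverWait(driver, timeout).until(...).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll_interval)
```

With Selenium itself the equivalent would be something like `WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "Modal")))`, which stops waiting as soon as the modal appears instead of always sleeping the full interval.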

