Return HTML code of a dynamic page using Selenium


Question

I'm trying to crawl this website; the problem is that its content is loaded dynamically.

Basically, I want the HTML I can see in the browser console (the rendered page), not what I see when I right click > show source.

I've tried some Selenium examples, but I can't get what I need. The code below uses Selenium, yet it returns only what right click > show source gives. How can I get the content of the loaded page?

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium import webdriver
import time

# Start the WebDriver and load the page
wd = webdriver.Firefox()
wd.get("https://www.leforem.be/particuliers/offres-emploi-recherche-par-criteres.html?exParfullText=&exPar_search_=true&exParGeographyEdi=true")

# Wait for the dynamically loaded elements to show up
time.sleep(5)

# And grab the page HTML source
html_page = wd.page_source
wd.quit()

# Now you can use html_page as you like

print(html_page)

Recommended answer

The search results live inside a frame, so you need to switch to that frame and then explicitly wait for the search results to appear before getting the page source:

from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


# Start the WebDriver and load the page
wd = webdriver.Firefox()
wd.get("https://www.leforem.be/particuliers/offres-emploi-recherche-par-criteres.html?exParfullText=&exPar_search_=true&exParGeographyEdi=true")

# The results are rendered inside the "cible" frame, so switch into it first
wd.switch_to.frame("cible")

# Wait up to 10 seconds for a result cell to become visible
wait = WebDriverWait(wd, 10)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'td.resultatIntitule')))

# page_source now returns the HTML of the loaded frame
print(wd.page_source)
wd.quit()
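
If you want to reuse this, the same idea can be wrapped in a function. The following is only a sketch under some assumptions: fetch_frame_source, CellTextParser and extract_cell_text are hypothetical names that are not part of Selenium, the URL is the one from the question with the stray whitespace removed, and I am assuming the td.resultatIntitule cells hold the text you are after. It also waits for the frame itself with EC.frame_to_be_available_and_switch_to_it instead of switching immediately, in case the frame is not attached yet when the page load returns. The second half shows one possible way to pull the cell text out of the returned HTML using only the standard library.

```python
from html.parser import HTMLParser

# URL taken from the question (whitespace inside the query string removed)
URL = ("https://www.leforem.be/particuliers/offres-emploi-recherche-par-criteres.html"
       "?exParfullText=&exPar_search_=true&exParGeographyEdi=true")


def fetch_frame_source(url, frame="cible", css="td.resultatIntitule", timeout=10):
    """Return the frame's HTML once an element matching `css` is visible."""
    # Imported here so the parsing helper below can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait

    wd = webdriver.Firefox()
    try:
        wd.get(url)
        wait = WebDriverWait(wd, timeout)
        # Wait until the frame exists, then switch into it.
        wait.until(EC.frame_to_be_available_and_switch_to_it(frame))
        # Wait until at least one result cell is visible inside the frame.
        wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, css)))
        return wd.page_source
    finally:
        wd.quit()  # close the browser even if a wait times out


class CellTextParser(HTMLParser):
    """Collect the text of every <td> whose class attribute contains `css_class`.

    Hypothetical helper; it does not handle tables nested inside a matching cell.
    """

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.cells = []
        self._buffer = None  # text chunks while inside a matching <td>, else None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and self.css_class in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buffer is not None:
            # Collapse runs of whitespace so each cell becomes one clean string.
            self.cells.append(" ".join("".join(self._buffer).split()))
            self._buffer = None


def extract_cell_text(html, css_class="resultatIntitule"):
    """Return the text of each matching cell, in document order."""
    parser = CellTextParser(css_class)
    parser.feed(html)
    parser.close()
    return parser.cells
```

With Firefox and geckodriver available, extract_cell_text(fetch_frame_source(URL)) should give a list with one string per result cell. If the wait times out, WebDriverWait raises TimeoutException, which usually means the frame name or the selector no longer matches the page.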
