Selenium: Browser display is different from the HTML code
Question
I want to log in to this page with Selenium using Python, but the page displayed in the browser is different from the page described in the HTML. The Firefox and Chrome webdrivers give the same result.
import os
from selenium import webdriver

chromedriver = "./chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
# OR
#driver = webdriver.Firefox()
driver.get('http://www.anb.org/login.htmlurl=%2Farticles%2Fhome.html&ip=94.112.189.79&nocookie=0')
# get screenshot of page
driver.get_screenshot_as_file('./01.png')
# get source code of page
print(driver.page_source)
I'm not allowed to post images, but the screenshot is exactly the same as the page displayed in the web browser.
HTML code from the driver:
<html><head>
<title>American National Biography Online</title>
<script>
document.write ("<FRAMESET ROWS=\"103,*\" FRAMEBORDER=0 BORDER=0 FRAMESPACING=0>\n");
document.write (" <FRAME SRC=\"top-home.html\" MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO>\n");
if (location.search) {
var url = unescape (location.search);
url = (new String(url)).substring(1);
if (url.indexOf ("&") == -1) {
document.write (" <FRAME SRC=\"" + url + "\" MARGINWIDTH=0 MARGINHEIGHT=0>\n");
} else {
document.write (" <FRAME SRC=\"main-home.html" + location.search + "\" MARGINWIDTH=0 MARGINHEIGHT=0>\n");
}
}
else
document.write (" <FRAME SRC=\"main-home.html\" NAME=atop MARGINWIDTH=0 MARGINHEIGHT=0>\n");
document.write ("</FRAMESET>\n");
</script></head>
<frameset rows="103,*" frameborder="0" border="0" framespacing="0">
<frame src="top-home.html" marginwidth="0" marginheight="0" scrolling="NO">
<frame src="main-home.html?url=%2Farticles%2Fbrowse.html&ip=94.112.189.79&nocookie=0" marginwidth="0" marginheight="0">
</frameset>
<noframes>
</noframes>
</html>
As you can see, the HTML and the picture do not match.
Maybe it is a frame problem?
My configuration:
osx 10.8.5
python 2.7.5
chrome browser 28.0.1500.71
firefox browser 24.0
I installed the latest Chrome/Firefox webdrivers, but I really don't know how to find their versions.
Recommended answer
The property page_source is of little use here: depending on the driver, it may return the HTML as the server first sent it to the browser rather than a copy of the current DOM.
The best way to get a copy is to use JavaScript and innerHTML:
# getElementsByTagName returns a collection, so take its first item
js_code = "return document.getElementsByTagName('html')[0].innerHTML"
your_elements = driver.execute_script(js_code)
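The same idea can be wrapped in a small helper. This is only a sketch: the name dom_snapshot is hypothetical, and it assumes driver is an already started Selenium WebDriver. It reads document.documentElement.outerHTML instead of innerHTML so that the html tag itself is part of the result.

```python
def dom_snapshot(driver):
    """Return the live DOM of the current document, serialized as HTML.

    `driver` is assumed to be a running Selenium WebDriver instance.
    """
    # document.documentElement is the <html> element; outerHTML includes
    # the element's own tag, which innerHTML would leave out.
    return driver.execute_script("return document.documentElement.outerHTML")
```

Calling dom_snapshot(driver) after driver.get(...) should give the markup as the browser currently holds it, still without the content of any frames.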
Also note that innerHTML doesn't span frame elements. Since you have frames in your code, you need to examine those individually:
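frames = driver.find_elements_by_tag_name("frame")
# A frame element has no markup of its own; read the document loaded inside it (same-origin frames only)
js_code = "return arguments[0].contentDocument.documentElement.innerHTML"
your_elements = driver.execute_script(js_code, frames[0])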
You can also switch to the frame:
driver.switch_to_frame("frameName")
After that, all code will execute within the context of this frame. Don't forget to switch back.
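The switch-and-switch-back pattern could look roughly like the sketch below. The helper name frame_sources is hypothetical, and driver is assumed to be a running Selenium WebDriver. It uses the switch_to.frame and switch_to.default_content spellings found in newer Selenium releases; older releases spell them switch_to_frame and switch_to_default_content.

```python
def frame_sources(driver):
    """Return the page source of every top-level <frame> element.

    `driver` is assumed to be a running Selenium WebDriver instance.
    """
    # "tag name" is the locator strategy string behind selenium's By.TAG_NAME.
    frames = driver.find_elements("tag name", "frame")
    sources = []
    for frame in frames:
        # Enter the frame: later commands now run inside its document.
        driver.switch_to.frame(frame)
        try:
            sources.append(driver.page_source)
        finally:
            # Always return to the top-level document, even after an error.
            driver.switch_to.default_content()
    return sources
```

The try/finally block is there so that a failure inside one frame does not leave the driver stuck in that frame's context.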