Selenium: Browser display is different from the HTML code

Question

I want to log in to this page with Selenium using Python, but the page displayed in the browser is different from the page described in the HTML. The Firefox and Chrome webdrivers give the same result.

import os
from selenium import webdriver

# path to the ChromeDriver executable
chromedriver = "./chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)

# OR
# driver = webdriver.Firefox()

driver.get('http://www.anb.org/login.htmlurl=%2Farticles%2Fhome.html&ip=94.112.189.79&nocookie=0')
# save a screenshot of the page
driver.get_screenshot_as_file('./01.png')

# print the source code of the page
print(driver.page_source)

I'm not allowed to post images, but the screenshot is exactly the same as the page displayed in the web browser.

HTML code from driver:

<html><head>
<title>American National Biography Online</title>
<script>
document.write ("<FRAMESET ROWS=\"103,*\" FRAMEBORDER=0 BORDER=0 FRAMESPACING=0>\n");
document.write ("  <FRAME SRC=\"top-home.html\" MARGINWIDTH=0 MARGINHEIGHT=0 SCROLLING=NO>\n");
if (location.search) {
  var url = unescape (location.search);
  url = (new String(url)).substring(1);
  if (url.indexOf ("&") == -1) {
    document.write ("  <FRAME SRC=\"" + url + "\" MARGINWIDTH=0 MARGINHEIGHT=0>\n");
  } else {
    document.write ("  <FRAME SRC=\"main-home.html" + location.search + "\" MARGINWIDTH=0 MARGINHEIGHT=0>\n");
  }
}
else
  document.write ("  <FRAME SRC=\"main-home.html\" NAME=atop MARGINWIDTH=0 MARGINHEIGHT=0>\n");
document.write ("</FRAMESET>\n");
</script></head>
<frameset rows="103,*" frameborder="0" border="0" framespacing="0">
  <frame src="top-home.html" marginwidth="0" marginheight="0" scrolling="NO">
  <frame src="main-home.html?url=%2Farticles%2Fbrowse.html&amp;ip=94.112.189.79&amp;nocookie=0" marginwidth="0" marginheight="0">
</frameset>

<noframes>
</noframes> 
</html>

As you can see, the HTML and the picture do not match.

Maybe it is a frame problem?

My configuration:

osx 10.8.5
python 2.7.5
chrome browser 28.0.1500.71
firefox browser 24.0

I installed the latest Chrome/Firefox webdrivers, but I don't know how to find their versions.

Recommended answer

The property page_source is almost useless here: it is not guaranteed to be a copy of the current DOM, and it may simply be the first version of the HTML that the server sent to the browser.

The best way to get a copy is to use JavaScript and innerHTML:

# getElementsByTagName returns a collection, so take its first item
js_code = "return document.getElementsByTagName('html')[0].innerHTML"
your_elements = driver.execute_script(js_code)
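
Whether page_source and the live DOM really differ depends on the driver, so it can be worth checking directly. The following is a minimal sketch, not part of the original answer: the names diff_source_against_dom and LIVE_DOM_SCRIPT are hypothetical, and the only assumption about the driver is that it has the page_source property and the execute_script method used above.

```python
import difflib

# JavaScript that serializes the document as the browser currently holds it.
LIVE_DOM_SCRIPT = "return document.documentElement.outerHTML"


def diff_source_against_dom(driver, context_lines=0):
    """Return unified-diff lines between driver.page_source and the live DOM.

    An empty list means both views contain the same lines.
    """
    reported = driver.page_source.splitlines()
    live = driver.execute_script(LIVE_DOM_SCRIPT).splitlines()
    return list(
        difflib.unified_diff(
            reported,
            live,
            "page_source",
            "live DOM",
            n=context_lines,
            lineterm="",
        )
    )
```

Calling diff_source_against_dom(driver) right after driver.get(...) gives a quick picture of what scripts on the page have changed. Cosmetic differences such as a doctype line or whitespace may also show up, so the output is a hint rather than a verdict.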

Also note that innerHTML doesn't span frame elements. Since you have frames in your code, you need to examine those individually:

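# find_elements_* (plural) returns a list; the singular form returns one element
frames = driver.find_elements_by_tag_name("frame")
# a <frame> element has no markup of its own, so read the document loaded inside it
# (contentDocument is only readable for same-origin frames)
js_code = "return arguments[0].contentDocument.documentElement.innerHTML"
your_elements = driver.execute_script(js_code, frames[0])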

You can also switch to the frame:

driver.switch_to_frame("frameName")

After that, all code will execute within the context of this frame. Don't forget to switch back.
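
One way to make sure the switch back is never forgotten is to wrap the switch in a context manager. The sketch below is an illustration, not part of the original answer: the helper names inside_frame and frame_sources are hypothetical, and it assumes the driver.switch_to.frame(...) and driver.switch_to.default_content() spellings found in newer Selenium Python bindings (older releases use driver.switch_to_frame(...) and driver.switch_to_default_content() instead).

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame, then always return to the top-level document."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the body raises, so the driver is never left in the frame.
        driver.switch_to.default_content()


def frame_sources(driver):
    """Map each top-level frame index to that frame's serialized DOM."""
    driver.switch_to.default_content()
    count = driver.execute_script("return window.frames.length")
    sources = {}
    for index in range(count):
        with inside_frame(driver, index):
            sources[index] = driver.execute_script(
                "return document.documentElement.outerHTML"
            )
    return sources
```

For the page in the question, frame_sources(driver) would be expected to return two entries, one for top-home.html and one for main-home.html. Nested frames are not followed in this sketch.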
