How to get a specific frame in a web page and retrieve its content
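Question
I want to access the translation results of a page loaded through Google Translate. The translation is displayed in the bottom content frame of the page's two frames, and I am interested in retrieving only that frame to get the translation.
Selenium for Python lets us fetch page contents via web automation:
browser.get('http://translate.google.com/#en/ar/'+hurl)
The required frame is an iframe:
<div id="contentframe" style="top:160px"><iframe src="/translate_p?hl=en&am... name=c frameborder="0" style="height:100%;width:100%;position:absolute;top:0px;bottom:0px;"></iframe></div>
But how do I get the bottom content frame element so that I can retrieve the translation using web automation?
I have learned that PyQuery also lets us browse the contents using jQuery-style selectors.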
Update: An answer mentioned that Selenium provides a method for this:
frame = browser.find_element_by_tag_name('iframe')
browser.switch_to_frame(frame)
# get page source
browser.page_source
But it does not work in the example above: it returns an empty page.
Answer
You can use
driver.switchTo().frame(1);
Here, the digit 1 inside frame() is the index of the frame in the web page. Indexes start at 0, so to switch to the second frame you should use driver.switchTo().frame(1);
The code above is Java. In Python, you can use the line below.
driver.switch_to_frame(1)
UPDATE
driver.get("http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/");
driver.switchTo().frame(0);
System.out.println(driver.findElement(By.xpath("/html/body/div/div/div[3]/h1/span/a")).getText());
Output: SaltyCrane ???????
I just tried to print the title, SaltyCrane, that appears inside the iframe. It worked for me except for the ? symbols after SaltyCrane: the rest of the title is Arabic, and it could not be decoded. The code above is Java; the same logic should also work in Python.
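The frame-switching steps in the answer could be sketched in Python roughly as follows. This is only a sketch under some assumptions: it uses the Selenium 4 spelling `driver.switch_to.frame(...)`, which replaced the older `switch_to_frame(...)` seen in this thread, and the helper names `inside_frame`, `frame_text`, `frame_source` and `demo` are made up for illustration. The frame index and XPath in `demo` are taken from the answer and may well not match the page as it exists today.

```python
from contextlib import contextmanager

# URL and XPath copied from the answer above; they may no longer match the live page.
URL = ("http://translate.google.com/translate?hl=en&sl=en&tl=ar&u="
       "http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/")
TITLE_XPATH = "/html/body/div/div/div[3]/h1/span/a"


@contextmanager
def inside_frame(driver, frame_ref):
    """Switch into a frame, then always return to the top-level document.

    frame_ref may be a 0-based index, a name/id string, or an iframe element.
    """
    driver.switch_to.frame(frame_ref)
    try:
        yield driver
    finally:
        driver.switch_to.default_content()


def frame_text(driver, frame_ref, xpath):
    """Return the text of the element found by `xpath` inside the given frame."""
    with inside_frame(driver, frame_ref):
        # "xpath" is the locator strategy string behind selenium's By.XPATH.
        return driver.find_element("xpath", xpath).text


def frame_source(driver, frame_ref):
    """Return the HTML source of the document loaded in the given frame."""
    with inside_frame(driver, frame_ref):
        return driver.page_source


def demo():
    """Not called automatically: needs the third-party selenium package and Firefox."""
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(URL)
        print(frame_text(driver, 0, TITLE_XPATH))
    finally:
        driver.quit()
```

Switching back with `default_content()` keeps later lookups from silently running inside the frame. If the frame is filled in asynchronously, waiting for it first (for example with Selenium's `WebDriverWait` and `expected_conditions.frame_to_be_available_and_switch_to_it`) may help with the empty page source mentioned in the question, though the thread does not confirm that this was the cause.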
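The ? symbols in the output are consistent with the text being written through a character encoding that has no Arabic letters, so each one is replaced by a question mark. That is one plausible explanation rather than something the thread confirms. The sketch below shows the mechanism in Python; the helper name `as_shown_by` and the sample title are hypothetical.

```python
def as_shown_by(text, encoding):
    """Return `text` as it would come out after passing through `encoding`.

    Characters the encoding cannot represent are replaced with "?".
    """
    return text.encode(encoding, errors="replace").decode(encoding)


# Hypothetical title: Latin text followed by five Arabic letters (written as escapes).
title = "SaltyCrane \u0645\u062f\u0648\u0646\u0629"

ascii_view = as_shown_by(title, "ascii")   # Arabic letters are lost
utf8_view = as_shown_by(title, "utf-8")    # UTF-8 keeps every character

print(ascii_view)  # prints: SaltyCrane ?????
```

If this is what happened, writing the output as UTF-8 (to a file or to a console configured for UTF-8) would be expected to preserve the Arabic part of the title.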