Looking for text of an element or source of current page


Problem description




I am doing the following in Selenium 2/WebDriver using Python and Firefox.

I am opening some web pages that I need to check for a specific string - which, if present, means it is a good page to parse.

The phrase I am looking for is an h2 element similar to this:

<h2 class="page_title">Worlds Of Fantasy : Medieval House</h2>

If that h2 is missing, I know I don't need to work on it, just return and get the next in line.

In the code I have a try/except/else block that looks for the phrase. If it finds it, it passes on to the next part of the function. If not, it should go to the except clause, which tells it to return.

There are 2 pages called in this test - the first has the phrase, the second does not.

The first page is opened, and passes the test.

The second page is opened, and I get an exception report - but it never returns to the calling code in main...it just stops.

Why isn't the exception following the proper path to return?

Here is the code:

#!/usr/bin/env python

from selenium import webdriver
from selenium.webdriver import Firefox as Browser
from selenium.webdriver.support.ui import WebDriverWait


browser = webdriver.Firefox()

def call_productpage(productlink):
    global browser

    print 'in call_productpage(' + productlink + ')'
    browser.get(productlink)
    browser.implicitly_wait(8)

    #start block with <div class="page_content"> 
    product_block = browser.find_element_by_xpath("//div[@class='page_content']");

    # <h2 class="page_title">Worlds Of Fantasy : Medieval House</h2>
    try:
        product_name = product_block.find_element_by_xpath("//h2[@class='page_title']");
    except Exception, err:
        #print "Failed!\nError (%s): %s" % (err.__class__.__name__, err)
        print 'return to main()'
        return 0
    else:
        nameStr = str(product_name.text)
        print 'product_name:' + nameStr
    finally:
        print "test over!"
        return 1

test1 = call_productpage('https://www.daz3d.com/i/3d-models/-/desk-clocks?spmeta=ov&item=12657')
if test1:
    print '\ntest 1 went OK\n'
else:
    print '\ntest 1 did NOT go OK\n'

test2 = call_productpage('https://www.daz3d.com/i/3d-models/-/dierdre-character-pack?spmeta=ov&item=397')
if test2:
    print '\ntest 2 went OK\n'
else:
    print '\ntest 2 did NOT go OK\n'

And here is a screenshot of the console showing the exception I get:

One other option I thought about using was to get the source of the page from the webdriver and do a find to see if the tag was there - but apparently there is no easy way to do THAT in webdriver!
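On the page-source idea: WebDriver does expose the rendered HTML through the driver's `page_source` attribute, so a plain substring check is possible. The following is a minimal sketch rather than a tested Selenium script. `FakePage` and `page_has_title` are hypothetical names introduced only for illustration, and the stub stands in for a real driver so the example runs without a browser.

```python
class FakePage:
    """Hypothetical stand-in for a WebDriver; only page_source is mimicked."""

    def __init__(self, html):
        self.page_source = html


def page_has_title(driver, marker='class="page_title"'):
    """Return True if the marker text occurs anywhere in the page HTML."""
    return marker in driver.page_source


good = FakePage('<div class="page_content"><h2 class="page_title">Medieval House</h2></div>')
bad = FakePage('<div class="listing">no product here</div>')

print(page_has_title(good))  # True
print(page_has_title(bad))   # False
```

With a real driver the call would be `page_has_title(browser)` after `browser.get(...)`. A substring test is cruder than an XPath lookup, since it depends on the exact attribute quoting in the served HTML.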

Solution

If you don't know which exception to expect, use a bare except together with the traceback module:

import traceback

try:
    int('string')
except:
    traceback.print_exc()
    print "returning 0"

# will print out an exception and execute everything in the 'except' clause:
# Traceback (most recent call last):
#   File "<stdin>", line 2, in <module>
# ValueError: invalid literal for int() with base 10: 'string'
# returning 0

But from the stack trace you already do know the exact exception name, so use it instead:

from selenium.common.exceptions import NoSuchElementException

try:
    #...
except NoSuchElementException as err:
    #...
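One further point about the question's code, which the answer does not raise: once the except clause does fire, its `return 0` is still replaced by the `return 1` in the finally block, because a return inside finally overrides any earlier return. The sketch below shows that behaviour in plain Python 3. The function names are hypothetical and `ValueError` merely stands in for the Selenium exception.

```python
def with_return_in_finally():
    try:
        raise ValueError("element missing")
    except ValueError:
        return 0
    finally:
        return 1  # runs last and replaces the 0 returned above


def without_return_in_finally():
    try:
        raise ValueError("element missing")
    except ValueError:
        return 0
    finally:
        print("test over!")  # cleanup only, no return here


print(with_return_in_finally())     # 1
print(without_return_in_finally())  # prints "test over!" and then 0
```

Keeping finally for cleanup only, and placing the success return after the try statement or in the else clause, lets the caller distinguish the two outcomes.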


UPDATE:

The exception is raised before the try ... except block, so nothing catches it and the script stops. It happens here:

product_block = browser.find_element_by_xpath("//div[@class='page_content']");

and not here:

product_name = product_block.find_element_by_xpath("//h2[@class='page_title']");
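A minimal sketch of one way to restructure the function so that either missing element is handled, assuming both lookups are moved inside the try. It is written for Python 3 and runs without Selenium: the `NoSuchElementException` class below is a local stand-in for `selenium.common.exceptions.NoSuchElementException`, and `FakeBrowser` and `has_product_title` are hypothetical names used only for illustration. For brevity both lookups go through the browser object, whereas the question searches for the h2 from `product_block`.

```python
class NoSuchElementException(Exception):
    """Local stand-in for selenium.common.exceptions.NoSuchElementException."""


class FakeBrowser:
    """Hypothetical stub that mimics only the lookup method used below."""

    def __init__(self, known_xpaths):
        self.known_xpaths = set(known_xpaths)

    def find_element_by_xpath(self, xpath):
        if xpath not in self.known_xpaths:
            raise NoSuchElementException(xpath)
        return xpath  # a real driver would return a WebElement


def has_product_title(browser):
    """Return True only if the content block and the title are both found."""
    try:
        # Both lookups sit inside the try, so a missing div is caught as well.
        browser.find_element_by_xpath("//div[@class='page_content']")
        browser.find_element_by_xpath("//h2[@class='page_title']")
    except NoSuchElementException:
        return False
    return True


good_page = FakeBrowser(["//div[@class='page_content']",
                         "//h2[@class='page_title']"])
bad_page = FakeBrowser([])

print(has_product_title(good_page))  # True
print(has_product_title(bad_page))   # False
```

With a real driver, the stand-in exception class would be replaced by the import from `selenium.common.exceptions`, and the stub by the actual `browser` object.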
