isinstance not working correctly with BeautifulSoup (NameError)

Question

I'm using isinstance to select some HTML tags and passing the filter function to BeautifulSoup's find_all. The problem is that I keep getting NameErrors from what should be perfectly executable code.

def horse_search(tag):
    return (tag.has_attr('href') and isinstance(tag.previous_element, span))

...

for tag in soup.find_all(horse_search):
    print(tag)

NameError: global name 'span' is not defined

I'm also getting errors from the example code in the BeautifulSoup documentation that uses isinstance together with tag.previous_element:

def surrounded_by_strings(tag):
    return (isinstance(tag.next_element, NavigableString)
            and isinstance(tag.previous_element, NavigableString))

for tag in soup.find_all(surrounded_by_strings):
    print(tag.name)

NameError: global name 'NavigableString' is not defined

What could be wrong? Thanks!

Recommended answer

To find all anchors that have a span parent and an href attribute:

for span in soup.find_all('span'):
    # find_all('a') searches every descendant of the span, not only direct children
    for a in span.find_all('a'):
        if a.has_attr('href'):
            print(a['href'])
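
One detail worth checking is that `span.find_all('a')` matches anchors at any depth inside a span, whereas "span parent" usually means a direct child. The sketch below compares the two readings on made-up markup; the HTML, the variable names, and the use of the CSS selector `span > a[href]` are illustrative assumptions, not part of the original answer. It assumes Beautiful Soup 4 with its bundled CSS selector support.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only
html = """
<p><span><a href="/a">direct child</a></span></p>
<p><span><b><a href="/b">nested deeper</a></b></span></p>
<p><span><a>no href</a></span></p>
<p><a href="/c">no span parent</a></p>
"""
soup = BeautifulSoup(html, "html.parser")

# Nested-loop form: any <a href> that is a descendant of a <span>
descendant_links = [
    a["href"]
    for span in soup.find_all("span")
    for a in span.find_all("a", href=True)
]

# CSS selector form: only <a href> whose direct parent is a <span>
child_links = [a["href"] for a in soup.select("span > a[href]")]

print(descendant_links)
print(child_links)
```

If only direct children are wanted, the selector form is probably the closer match; the nested loop is fine when depth does not matter.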

While this works, a tool that supports XPath is often neater. For example, with lxml and XPath your code can be as short as:

from lxml import etree

# url is the path or URL of the page to parse
hrefs = etree.parse(url, etree.HTMLParser()).xpath('//span/a/@href')
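
As for the NameError itself: isinstance needs a class object as its second argument, and neither `span` nor `NavigableString` is a name defined in the asker's module. `NavigableString` has to be imported from `bs4`, while `span` is an HTML tag name rather than a Python class, so it can be checked through the `name` attribute of a `Tag`. A minimal sketch, assuming Beautiful Soup 4; the sample markup and the names `html`, `horse_links`, and `surrounded` are made up for illustration:

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Made-up markup for illustration only
html = (
    '<div><span><a href="/horse/1">Red Rum</a></span>'
    ' text <a href="/other">Other</a><b>x</b></div>'
)
soup = BeautifulSoup(html, "html.parser")


def horse_search(tag):
    # previous_element is whatever was parsed just before this tag;
    # test that it is a Tag whose name is "span" instead of isinstance(..., span)
    prev = tag.previous_element
    return tag.has_attr("href") and isinstance(prev, Tag) and prev.name == "span"


def surrounded_by_strings(tag):
    # Works once NavigableString has been imported from bs4
    return (isinstance(tag.next_element, NavigableString)
            and isinstance(tag.previous_element, NavigableString))


horse_links = [tag["href"] for tag in soup.find_all(horse_search)]
surrounded = [tag.name for tag in soup.find_all(surrounded_by_strings)]

print(horse_links)
print(surrounded)
```

Note that `previous_element` follows parse order, so any whitespace between `<span>` and `<a>` in real markup would make it a string rather than the span; checking `tag.parent.name == "span"` may be more robust in that case.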
