BeautifulSoup find and find_all not working as expected
Question
I just started using BeautifulSoup and have run into a problem. I set up the HTML snippet below and created a BeautifulSoup object from it:
from bs4 import BeautifulSoup
html_snippet = '<p class="course"><span class="text84">Ae 100. Research in Aerospace. </span><span class="text85">Units to be arranged in accordance with work accomplished. </span><span class="text83">Open to suitably qualified undergraduates and first-year graduate students under the direction of the staff. Credit is based on the satisfactory completion of a substantive research report, which must be approved by the Ae 100 adviser and by the option representative. </span> </p>'
subject = BeautifulSoup(html_snippet)
I have tried several find and find_all calls like the ones below, but each one returns either None or an empty list:
subject.find(text = 'A')
subject.find(text = 'Research')
subject.next_element.find('A')
subject.find_all(text = 'A')
Earlier, when I created the BeautifulSoup object from an HTML file on my computer, find and find_all worked fine. The problems started when I took html_snippet from a webpage that I read online through urllib2.
Can anyone point out where the issue is?
Recommended answer
Pass the argument like this:
import re
subject.find(text=re.compile('A'))
By default, the text filter matches a string only when it equals the entire body of a text node. Passing in a regular expression lets you match on fragments.
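The difference can be seen in a minimal sketch. It assumes Beautiful Soup 4 with the built-in "html.parser" and uses a shortened, hypothetical version of the snippet from the question (trailing spaces removed), so the variable names and markup here are illustrative only. Recent Beautiful Soup 4 releases also accept this filter under the name string; text is kept here to match the calls in this thread.

```python
import re
from bs4 import BeautifulSoup

# Shortened, hypothetical version of the question's snippet.
html = ('<p class="course">'
        '<span class="text84">Ae 100. Research in Aerospace.</span>'
        '<span class="text85">Units to be arranged.</span>'
        '</p>')
soup = BeautifulSoup(html, "html.parser")

# A plain string must equal the whole text node, so a fragment finds nothing.
exact_fragment = soup.find(text="Research")

# The complete string of the first span does match.
exact_full = soup.find(text="Ae 100. Research in Aerospace.")

# A compiled pattern is searched for inside each text node.
regex_fragment = soup.find(text=re.compile("Research"))

# find_all with a plain-string fragment returns an empty list.
all_exact = soup.find_all(text="A")

print(exact_fragment, exact_full, regex_fragment, all_exact, sep="\n")
```

This reproduces the symptom from the question: the plain-string calls come back empty unless the string is the whole text node, while the compiled pattern finds the span's text.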
To match only bodies beginning with A, you can use the following:
subject.find(text=re.compile('^A'))
To match only bodies containing words that begin with A, you can use:
subject.find_all(text = re.compile(r'\bA'))
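To compare the three patterns side by side, here is a small sketch that uses only the standard re module. The sample strings are hypothetical and chosen so that each pattern selects a different subset. It assumes the pattern is applied with search semantics, which is what the '^A' example above relies on.

```python
import re

# Hypothetical text-node bodies, chosen to tell the patterns apart.
texts = [
    "Ae 100. Research in Aerospace.",  # starts with A
    "Research in Aerospace.",          # has a word starting with A
    "NASA report",                     # contains A only inside a word
]

patterns = {
    "contains A": re.compile("A"),
    "starts with A": re.compile("^A"),
    "word starts with A": re.compile(r"\bA"),
}

# For each pattern, keep the strings in which a search succeeds.
matches = {
    label: [t for t in texts if pattern.search(t)]
    for label, pattern in patterns.items()
}

for label, found in matches.items():
    print(label, "->", found)
```

The plain 'A' pattern accepts all three strings, '^A' accepts only the first, and r'\bA' accepts the first two, because the A characters in "NASA" do not sit at a word boundary.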
It's difficult to tell more specifically what you're looking for, so let me know if I've misinterpreted your question.