Python regular expression for HTML parsing (BeautifulSoup)

Views: 50 | Published: 2021/6/25 20:01:42 | Tags: python, regex, screen-scraping


Question

I want to grab the value of a hidden input field in HTML.

<input type="hidden" name="fooId" value="12-3456789-1111111111" />

I want to write a regular expression in Python that returns the value of fooId, given that I know the line in the HTML follows this format:

<input type="hidden" name="fooId" value="[id is here]" />

Can someone provide a Python example that parses the HTML for this value?

Recommended answer

For this particular case, BeautifulSoup is harder to write than a regex, but it is much more robust. I'm just contributing the BeautifulSoup example, given that you already know which regexp to use :-)

from bs4 import BeautifulSoup  # BeautifulSoup 4; the old "BeautifulSoup" module is version 3

# Or retrieve it from the web, etc.
with open('/yourwebsite/page.html', 'r') as f:
    html_data = f.read()

# Create the soup object from the HTML data
soup = BeautifulSoup(html_data, 'html.parser')

# Find the proper tag. The filters go in attrs because find()'s first
# parameter is itself called "name", so name='fooId' would clash with 'input'.
fooId = soup.find('input', attrs={'name': 'fooId', 'type': 'hidden'})

# Index the tag directly to read the value attribute
value = fooId['value']
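
For comparison, the regex route that the answer alludes to but does not show might look like the sketch below. It is only an illustration, not the answerer's code: the pattern assumes the attributes appear in exactly the order given in the question and use double quotes, and the names FOO_ID_RE and extract_foo_id are hypothetical.

```python
import re

# Sample line in the format given in the question
html_data = '<input type="hidden" name="fooId" value="12-3456789-1111111111" />'

# Assumption: type, name, and value appear in exactly this order, double-quoted
FOO_ID_RE = re.compile(r'<input\s+type="hidden"\s+name="fooId"\s+value="([^"]*)"')


def extract_foo_id(html):
    """Return the fooId value, or None if the expected line is absent."""
    match = FOO_ID_RE.search(html)
    return match.group(1) if match else None


foo_id = extract_foo_id(html_data)

# The same tag with its attributes reordered no longer matches the pattern
reordered = '<input name="fooId" value="12-3456789-1111111111" type="hidden" />'
reordered_result = extract_foo_id(reordered)
```

The second call returns None even though the tag is equivalent HTML, which is the robustness gap the answer mentions: an attribute-based lookup with a parser does not depend on attribute order.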
