Python: extract id value from href source

Views: 118 | Published: 2016/8/5 19:14:25 | Tags: python, regex, beautifulsoup


Question

I've managed to extract the href URIs from the page source using BeautifulSoup, but I now want to extract the uid value from each of several links like the ones below:

For example:

<a href="test.html?uid=5444974">
<a href="test.html?uid=5444972">
<a href="test.html?uid=54444972">

Help would be greatly appreciated!

Solution

>>> import re
>>> from bs4 import BeautifulSoup
>>> html = '<a href="test.html?uid=5444974">\n<a href="test.html?uid=5444972">\n<a href="test.html?uid=54444972">'
>>> soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
>>> anchors = soup.find_all('a')
>>> r = re.compile(r'uid=(\d+)')  # raw string; captures the digits after "uid="
>>> uids = []
>>> for a in anchors:
...     uids.append(r.search(a['href']).group(1))
...
>>> uids
['5444974', '5444972', '54444972']
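
As an alternative to the regex, the query string can be parsed with the standard library's `urllib.parse`. This is only a sketch, not part of the original answer: the helper name `uids_from_hrefs` and the extra `about.html` link are made up for illustration, and it assumes the href strings have already been pulled out of the page (for example with `a['href']` as above). Unlike `r.search(...).group(1)`, it skips links that carry no `uid` parameter rather than raising an error.

```python
from urllib.parse import urlparse, parse_qs


def uids_from_hrefs(hrefs):
    """Collect the uid query parameter from each href string (hypothetical helper)."""
    uids = []
    for href in hrefs:
        # urlparse splits off the part after '?'; parse_qs maps each key to a list of values
        query = parse_qs(urlparse(href).query)
        # links without a uid contribute nothing
        uids.extend(query.get("uid", []))
    return uids


hrefs = [
    "test.html?uid=5444974",
    "test.html?uid=5444972",
    "about.html",  # illustrative link with no uid
    "test.html?uid=54444972",
]
print(uids_from_hrefs(hrefs))  # ['5444974', '5444972', '54444972']
```

The regex is fine for a quick one-off; parsing the query string may be the safer choice if the links can contain other parameters or the `uid` value is not always numeric.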
