Find specific link w/ beautifulsoup

Published: 2021/12/23 20:06:10. Tags: python, regex, beautifulsoup

Problem description

Hi, I cannot for the life of me figure out how to find links that begin with certain text. findall('a') works fine, but it returns far too much. I just want to make a list of all links that begin with http://www.nhl.com/ice/boxscore.htm?id=

Can anyone help me?

Thanks a lot.

Recommended answer

First set up a test document and open up the parser with BeautifulSoup:

>>> from BeautifulSoup import BeautifulSoup
>>> doc = '<html><body><div><a href="something">yep</a></div><div><a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a></div><a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a></body></html>'
>>> soup = BeautifulSoup(doc)
>>> print soup.prettify()
<html>
 <body>
  <div>
   <a href="something">
    yep
   </a>
  </div>
  <div>
   <a href="http://www.nhl.com/ice/boxscore.htm?id=3">
    somelink
   </a>
  </div>
  <a href="http://www.nhl.com/ice/boxscore.htm?id=7">
   another
  </a>
 </body>
</html>
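The session above is written for Python 2 and the older BeautifulSoup 3 package. For readers on Python 3, a rough equivalent of the same setup might look like the sketch below. It assumes the beautifulsoup4 package (imported as bs4) is installed; the variable name all_links is only illustrative.

```python
from bs4 import BeautifulSoup  # assumption: beautifulsoup4 is installed

# Same test document as in the answer, split across lines for readability.
doc = (
    '<html><body>'
    '<div><a href="something">yep</a></div>'
    '<div><a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a></div>'
    '<a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a>'
    '</body></html>'
)

# Naming the parser explicitly avoids bs4's "no parser specified" warning.
soup = BeautifulSoup(doc, "html.parser")

# find_all is the bs4 spelling of findAll; with only a tag name it
# returns every <a> tag, which is the "way too much" case from the question.
all_links = soup.find_all("a")
print(len(all_links))
print(soup.prettify())
```

With no filter, all three links come back, including the unrelated "something" link.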

Next, we can search for all <a> tags with an href attribute starting with http://www.nhl.com/ice/boxscore.htm?id=. You can use a regular expression for it:

>>> import re
>>> soup.findAll('a', href=re.compile(r'^http://www\.nhl\.com/ice/boxscore\.htm\?id='))
[<a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a>, <a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a>]
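In a regular expression, "." matches any character and "?" is a quantifier, so a URL prefix needs escaping before it is used as a pattern. The sketch below shows two ways to sidestep hand-written escapes, assuming Python 3 and the beautifulsoup4 package: building the pattern with re.escape, and using a CSS "starts with" attribute selector. The names PREFIX, by_regex, by_css and urls are illustrative and not part of the original answer.

```python
import re
from bs4 import BeautifulSoup  # assumption: beautifulsoup4 is installed

PREFIX = "http://www.nhl.com/ice/boxscore.htm?id="

doc = (
    '<html><body>'
    '<div><a href="something">yep</a></div>'
    '<div><a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a></div>'
    '<a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a>'
    '</body></html>'
)
soup = BeautifulSoup(doc, "html.parser")

# Option 1: re.escape turns "." and "?" into literal characters,
# and "^" anchors the match at the start of the href value.
pattern = re.compile("^" + re.escape(PREFIX))
by_regex = soup.find_all("a", href=pattern)

# Option 2: the CSS attribute selector href^="..." means "href starts with".
by_css = soup.select('a[href^="' + PREFIX + '"]')

urls = [a["href"] for a in by_regex]
print(urls)
```

Both options should select the same two box-score links. The selector form avoids regular expressions entirely, which may be easier to read when the prefix is a fixed string.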
