Find specific link w/ beautifulsoup

Views: 118 | Posted: 2016/8/5 19:03:41 | Tags: python, regex, beautifulsoup

Question

Hi, I cannot for the life of me figure out how to find links that begin with certain text. findAll('a') works fine, but it returns far too much. I just want to make a list of all links that begin with http://www.nhl.com/ice/boxscore.htm?id=

Can anyone help me?

Thank you very much.

Recommended answer

First set up a test document and open up the parser with BeautifulSoup:

>>> from BeautifulSoup import BeautifulSoup
>>> doc = '<html><body><div><a href="something">yep</a></div><div><a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a></div><a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a></body></html>'
>>> soup = BeautifulSoup(doc)
>>> print soup.prettify()
<html>
 <body>
  <div>
   <a href="something">
    yep
   </a>
  </div>
  <div>
   <a href="http://www.nhl.com/ice/boxscore.htm?id=3">
    somelink
   </a>
  </div>
  <a href="http://www.nhl.com/ice/boxscore.htm?id=7">
   another
  </a>
 </body>
</html>

Next, we can search for all <a> tags with an href attribute starting with http://www.nhl.com/ice/boxscore.htm?id=. You can use a regular expression for it:

>>> import re
>>> soup.findAll('a', href=re.compile(r'^http://www\.nhl\.com/ice/boxscore\.htm\?id='))
[<a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a>, <a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a>]
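The session above uses BeautifulSoup 3 on Python 2 (`from BeautifulSoup import ...`, the `print` statement). For readers on Python 3, the following is a minimal sketch of the same prefix search, assuming the `bs4` package (Beautiful Soup 4) is installed and using the standard library's `html.parser` backend. The names `PREFIX`, `by_regex`, `by_function`, and `by_selector` are illustrative and not part of the original answer. It shows three ways to apply the filter and collects the `href` strings rather than the tag objects, since the question asks for a list of links.

```python
import re

from bs4 import BeautifulSoup

# Prefix taken from the question; the variable name is illustrative.
PREFIX = "http://www.nhl.com/ice/boxscore.htm?id="

# Same test document as in the answer above.
doc = (
    '<html><body><div><a href="something">yep</a></div>'
    '<div><a href="http://www.nhl.com/ice/boxscore.htm?id=3">somelink</a></div>'
    '<a href="http://www.nhl.com/ice/boxscore.htm?id=7">another</a>'
    "</body></html>"
)

soup = BeautifulSoup(doc, "html.parser")

# Option 1: regular expression. re.escape quotes the dots and the "?"
# so they match literally instead of acting as regex metacharacters.
pattern = re.compile("^" + re.escape(PREFIX))
by_regex = [a["href"] for a in soup.find_all("a", href=pattern)]

# Option 2: a plain function filter. It receives the attribute value,
# which is None for <a> tags that have no href at all.
def starts_with_prefix(href):
    return href is not None and href.startswith(PREFIX)

by_function = [a["href"] for a in soup.find_all("a", href=starts_with_prefix)]

# Option 3: CSS attribute selector; "^=" means "value begins with".
by_selector = [a["href"] for a in soup.select('a[href^="%s"]' % PREFIX)]

print(by_regex)
```

All three lists should contain the same two box score URLs in document order. The function filter avoids regex escaping entirely, which may be the least error-prone choice when the prefix is a fixed string.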
