How can I get href links from HTML using Python?

Views: 1160 Published: 2018/6/13 10:41:35 Tags: python html hyperlink beautifulsoup href


Problem description

    import urllib2

    website = "WEBSITE"
    openwebsite = urllib2.urlopen(website)
    # Read the response body from the handle returned by urlopen
    html = openwebsite.read()

    print html

So far so good.

But I want only href links from the plain text HTML. How can I solve this problem?
Solution
Try with BeautifulSoup:

    from BeautifulSoup import BeautifulSoup
    import urllib2
    import re

    html_page = urllib2.urlopen("http://www.yourwebsite.com")
    soup = BeautifulSoup(html_page)
    # Print the href attribute of every <a> tag
    for link in soup.findAll('a'):
        print link.get('href')
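
The snippet above targets Python 2 and BeautifulSoup 3, neither of which is maintained any longer. As a minimal sketch, assuming Python 3 with the `beautifulsoup4` package installed, the same loop might look like this. The `extract_links` helper and the sample HTML are illustrative names, not part of the original answer; for a live page, the HTML could be fetched with `urllib.request.urlopen(url).read()` instead of using a literal string.

```python
from bs4 import BeautifulSoup  # assumes: pip install beautifulsoup4


def extract_links(html):
    """Return the href value of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all is the bs4 spelling of findAll; href=True skips anchors without an href
    return [a["href"] for a in soup.find_all("a", href=True)]


# Illustrative sample document (not from the original question)
sample_html = """
<html><body>
  <a href="http://www.example.com/">absolute</a>
  <a href="/about">relative</a>
  <a name="top">no href</a>
</body></html>
"""

links = extract_links(sample_html)
for href in links:
    print(href)
```

Passing `href=True` is a design choice here: the original loop prints `None` for anchors without an `href`, whereas this version leaves them out.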
In case you just want links starting with http://, you should use:

    soup.findAll('a', attrs={'href': re.compile("^http://")})
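
The same prefix filter can also be sketched without third-party packages. The following is a minimal example, assuming Python 3 and using the standard library's `html.parser` in place of BeautifulSoup; the class name `LinkCollector`, its `pattern` argument, and the sample HTML are hypothetical and not part of the original answer.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags whose href matches a regex."""

    def __init__(self, pattern=r"^http://"):
        super().__init__()
        self.href_pattern = re.compile(pattern)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and self.href_pattern.search(value):
                self.links.append(value)


# Illustrative sample document (not from the original question)
sample_html = (
    '<a href="http://www.example.com/">kept</a>'
    '<a href="https://www.example.com/secure">skipped by default</a>'
    '<a href="/relative">skipped</a>'
)

collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)

# Widening the pattern also keeps https:// links
both = LinkCollector(r"^https?://")
both.feed(sample_html)
print(both.links)
```

The default pattern mirrors the answer's `^http://`, so `https://` links are dropped unless the pattern is widened as in the second call.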
