Using BeautifulSoup to extract the title of a link

Views: 189 Published: 2020/9/20 6:07:19 Tags: python python-2.7 web-scraping beautifulsoup python-requests

Question

I'm trying to extract the title of a link using BeautifulSoup. The code that I'm working with is as follows:

import requests
from bs4 import BeautifulSoup

url = "http://www.example.com"
source_code = requests.get(url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text, "lxml")
for link in soup.findAll('a', {'class': 'a-link-normal s-access-detail-page  a-text-normal'}):
    title = link.get('title')
    print title

Now, an example link element contains the following:

<a class="a-link-normal s-access-detail-page a-text-normal" href="http://www.amazon.in/Introduction-Computation-Programming-Using-Python/dp/8120348664" title="Introduction To Computation And Programming Using Python"><h2 class="a-size-medium a-color-null s-inline s-access-title a-text-normal">Introduction To Computation And Programming Using <strong>Python</strong></h2></a>

However, nothing gets displayed after I run the above code. How can I extract the value stored inside the title attribute of the anchor tag stored in link?

Recommended answer

Well, it seems you have put two spaces between s-access-detail-page and a-text-normal, so the class string does not match any link. Try it with a single space between the class names, then print the number of links found. You can also print the tag itself with print link.

import requests
from bs4 import BeautifulSoup

url = "http://www.amazon.in/s/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=python"
source_code = requests.get(url)
plain_text = source_code.content
soup = BeautifulSoup(plain_text, "lxml")
# Single spaces between the class names, as in the example link element
links = soup.findAll('a', {'class': 'a-link-normal s-access-detail-page a-text-normal'})
# Number of matching links; 0 means the class string matched nothing
print len(links)
for link in links:
    title = link.get('title')
    print title
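
To see the whitespace issue without hitting the network, here is a minimal sketch that parses the example link element from the question directly. It is my own illustration, not part of the original answer, and it makes a few assumptions: it is written for Python 3 (print as a function) rather than the Python 2.7 used above, it uses the built-in "html.parser" instead of "lxml" so nothing extra needs installing, it shortens the inner h2 markup, and it assumes a reasonably recent bs4 release, where a class string containing spaces is compared against the class names joined by single spaces.

```python
from bs4 import BeautifulSoup

# Example anchor from the question (inner <h2> attributes shortened).
html = (
    '<a class="a-link-normal s-access-detail-page a-text-normal" '
    'href="http://www.amazon.in/Introduction-Computation-Programming-Using-Python/dp/8120348664" '
    'title="Introduction To Computation And Programming Using Python">'
    '<h2>Introduction To Computation And Programming Using <strong>Python</strong></h2></a>'
)

# Assumption: "html.parser" is used here only to avoid the lxml dependency.
soup = BeautifulSoup(html, "html.parser")

# Class string with a doubled space, as in the question.
two_spaces = soup.find_all(
    "a", {"class": "a-link-normal s-access-detail-page  a-text-normal"}
)

# Class string with single spaces, as in the answer.
one_space = soup.find_all(
    "a", {"class": "a-link-normal s-access-detail-page a-text-normal"}
)

# Matching on one class name avoids depending on spacing or class order.
by_one_class = soup.find_all("a", class_="s-access-detail-page")

# A CSS selector requires all three classes, in any order.
by_selector = soup.select("a.a-link-normal.s-access-detail-page.a-text-normal")

print(len(two_spaces), len(one_space), len(by_one_class), len(by_selector))

# Read the title attribute from each matched anchor.
titles = [link.get("title") for link in by_selector]
print(titles)
```

If the exact class string feels brittle, matching on a single class name or using select() is usually the safer choice, since neither depends on how the class names are spaced or ordered in the page.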
