Extract title with BeautifulSoup

Views: 24 Published: 2021/12/23 20:08:10 Tags: python-3.x beautifulsoup

Question

I have this:

from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]  # '<!doctype html public "-//W3C//DTD HTML 4.0 Transitional//EN'

from bs4 import BeautifulSoup
raw = BeautifulSoup(html, 'html.parser').get_text()
raw.find_all('title', limit=1)
print(raw.find_all("title"))

I want to extract the title of the page using BeautifulSoup, but I get this error:

Traceback (most recent call last):
  File "C:\Users\Passanova\AppData\Local\Programs\Python\Python35-32\test.py", line 8, in <module>
    raw.find_all('title', limit=1)
AttributeError: 'str' object has no attribute 'find_all'

Any suggestions would be appreciated.

Recommended answer

To navigate the soup, you need a BeautifulSoup object, not a string. get_text() returns the document's text as a plain str, which has no find_all method, hence the AttributeError. So remove the get_text() call on the soup.
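
The difference is easy to check on a small inline document. The sketch below uses a made-up HTML string (an assumption, standing in for the downloaded page) so that it runs without network access:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
sample_html = (
    "<html><head><title>Sample page</title></head>"
    "<body><p>Hello</p></body></html>"
)

soup = BeautifulSoup(sample_html, "html.parser")
text = soup.get_text()

# get_text() flattens the document to a plain string, so the tag
# structure (and every navigation method) is gone.
print(isinstance(text, str))       # the result is an ordinary string
print(hasattr(text, "find_all"))   # strings have no find_all
print(hasattr(soup, "find_all"))   # the soup object does
```

Keeping the soup object around and calling get_text() only when plain text is actually needed avoids the error.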

Moreover, you can replace raw.find_all('title', limit=1) with find('title'), which is nearly equivalent: find_all returns a list containing the single match, while find returns the tag itself (or None if nothing matches).
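
A minimal sketch of how the two calls differ in what they return, again using a made-up HTML string rather than the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; it has a <title> but no <h1>.
sample_html = "<html><head><title>Sample page</title></head><body></body></html>"
soup = BeautifulSoup(sample_html, "html.parser")

as_list = soup.find_all("title", limit=1)  # list-like result with one Tag
as_tag = soup.find("title")                # the Tag itself

no_list = soup.find_all("h1", limit=1)     # empty result when nothing matches
no_tag = soup.find("h1")                   # None when nothing matches

print(len(as_list), as_list[0] == as_tag)
print(len(no_list), no_tag)
```

With find there is no list to index, but the result has to be checked for None before its attributes are used.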

Try this:

from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
html[:60]  # Peek at the start of the page (only echoed in an interactive session)

from bs4 import BeautifulSoup
# Keep the BeautifulSoup object; do not call get_text() on it
soup = BeautifulSoup(html, 'html.parser')
title = soup.find('title')

print(title)         # Prints the whole <title> tag
print(title.string)  # Prints the text inside the tag
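
If a page might not contain a <title> element, find('title') returns None and title.string then raises an AttributeError. A defensive sketch follows; extract_title is a hypothetical helper name, and the sample strings are made up for illustration:

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the page title as a stripped string, or None if unavailable.

    Hypothetical helper, not part of BeautifulSoup itself.
    """
    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("title")
    # find() returns None when there is no <title>; .string is None
    # when the tag is empty or has more than one child.
    if title is None or title.string is None:
        return None
    return title.string.strip()


print(extract_title("<html><head><title> Hi </title></head></html>"))
print(extract_title("<p>no title here</p>"))
```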
