beautifulsoup, page 5 - IT屋 (programmer software development and technology sharing community)

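Beautiful Soup: find tags based on a partial attribute value

I am trying to identify tags in an HTML document based on part of an attribute value. For example, if I have a BeautifulSoup object: from bs4 import BeautifulSoup; r = requests.get("http://My_Page"); soup = BeautifulSoup(r.text, "html.parser") I want the tr tags with an id attribute whose value is formatted like this: "news_4343_ ..

Posted: 2021-12-23 20:53:38 python beautifulsoup Python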
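
One possible approach to the partial-id question above, as a minimal sketch: pass a compiled regular expression as the attribute filter, or use a CSS prefix selector. The markup and the full id values below are invented for illustration; only the `news_4343_` prefix comes from the excerpt.

```python
import re
from bs4 import BeautifulSoup

# Illustrative markup; the ids after the prefix are made up for this sketch.
html = """
<table>
  <tr id="news_4343_1001"><td>first</td></tr>
  <tr id="news_4343_1002"><td>second</td></tr>
  <tr id="news_9999_1003"><td>other</td></tr>
  <tr><td>no id</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# A compiled regex used as the attribute filter matches on part of the value.
prefix_rows = soup.find_all("tr", id=re.compile(r"^news_4343_"))
matching_ids = [row["id"] for row in prefix_rows]

# The CSS attribute selector [id^=...] expresses the same prefix match.
css_ids = [row["id"] for row in soup.select('tr[id^="news_4343_"]')]

print(matching_ids)
print(css_ids)
```

Rows without an id attribute are simply skipped by both filters.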

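Using Python requests and Beautiful Soup to pull text

Thanks for looking at my question. I would like to know if there is any way to pull the data-sitekey from this text... here is the URL of the page https://e-com.secure.force.com/adidasUSContact/ ..

Posted: 2021-12-23 20:53:30 python beautifulsoup python-requests bs4 Python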

'NoneType' object has no attribute 'text' in BeautifulSoup

I am trying to scrape the Google result when I search for "what is 2+2", but the following code returns 'NoneType' object has no attribute 'text'. Please help me achieve the desired goal. text="what is 2+2"; search=text.replace(" ","+"); link="https://www.google.com/search?q="+search; headers={'User-Agent':'Mozilla/5.0 ..

Posted: 2021-12-23 20:53:20 python web-scraping beautifulsoup Python
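
The error in the question above typically appears when `find()` matches nothing and returns `None`. A minimal sketch of a guard follows; the helper name `text_or_default`, the markup, and the class names are hypothetical and are not taken from the question's code.

```python
from bs4 import BeautifulSoup

# Stand-in markup; a real search results page will look different.
html = '<div class="result"><span class="answer">4</span></div>'
soup = BeautifulSoup(html, "html.parser")


def text_or_default(soup, name, class_name, default=""):
    """Return the stripped text of the first match, or `default` if absent."""
    tag = soup.find(name, class_=class_name)
    # find() returns None when nothing matches; calling .text on None
    # is what raises "'NoneType' object has no attribute 'text'".
    return tag.get_text(strip=True) if tag is not None else default


found = text_or_default(soup, "span", "answer")
missing = text_or_default(soup, "span", "no-such-class", default="(not found)")

print(found)
print(missing)
```

When the guard reports a missing element, it is often worth printing the fetched HTML, since the server may return different markup to a script than to a browser.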

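BeautifulSoup gives me unicode+html symbols rather than straight unicode. Is this a bug or a misunderstanding?

I am using BeautifulSoup to scrape a website. The website's page renders fine in my browser: Oxfam's report entitled “Offside! http://www.coopamerica.org/programs/responsibleshopper/company.cfm?id=271 In particular, the single and double quotes look fine. They look like HTML symbols rather than ASCII, though strangely, when I view the source in FF3, they appear to be ..

Posted: 2021-12-23 20:53:11 python html unicode beautifulsoup Front-end development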

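How to replace HTML comments with a custom <comment> element

I am using BeautifulSoup in Python to batch-convert a large number of HTML files to XML. A sample HTML file looks like this: ..

Posted: 2021-12-23 20:53:02 python html regex xml beautifulsoup Front-end development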
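
A minimal sketch of one way to do the replacement asked about above, assuming the goal is to turn each HTML comment into a `<comment>` element holding the comment text. The sample markup is invented, since the question's HTML file is not shown in the excerpt.

```python
from bs4 import BeautifulSoup, Comment

# Invented sample input.
html = "<div><!-- first note --><p>text</p><!-- second note --></div>"
soup = BeautifulSoup(html, "html.parser")

# Comment nodes are strings in the tree, so filter strings by type.
comments = soup.find_all(string=lambda s: isinstance(s, Comment))

for comment in comments:
    new_tag = soup.new_tag("comment")   # custom element name from the question title
    new_tag.string = comment.strip()    # keep the comment text, minus padding
    comment.replace_with(new_tag)

converted = str(soup)
print(converted)
```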

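BeautifulSoup: do not add spaces where they matter, remove them where they don't

This sample Python program: document=''' This is something, it happens in real life '''; from bs4 import BeautifulSoup; soup = BeautifulSoup(document); print(soup.prettify()) produces the following output: This is something, it happens in real life ..

Posted: 2021-12-23 20:52:49 python html beautifulsoup Front-end development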

Why am I getting the error "UnicodeEncodeError: 'charmap' codec can't encode character '\u25b2' in position 84811: character maps to <undefined>"?

I get UnicodeEncodeError: 'charmap' codec can't encode character '\u200b' in position 756: character maps to <undefined> when running this code: from bs4 import BeautifulSoup; import requests; r = requests.get('https://stackoverflow.com').t ..

Posted: 2021-12-23 20:52:40 python-3.x web-scraping beautifulsoup encoding Other development
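
The 'charmap' error above usually comes from writing text to an output stream whose encoding cannot represent some characters. The sketch below reproduces it without a network call and shows two common workarounds; using `cp1252` as a stand-in for the console code page is an assumption, as the question does not say which code page was active.

```python
import io

# Text containing the two characters named in the error messages.
text = "price \u25b2 up\u200b"

# cp1252 stands in for a legacy Windows console encoding (an assumption).
try:
    text.encode("cp1252")
    failed = False
except UnicodeEncodeError:
    failed = True

# Workaround 1: substitute unencodable characters instead of raising.
safe = text.encode("cp1252", errors="replace").decode("cp1252")

# Workaround 2: write through a UTF-8 stream, which can represent everything.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="utf-8")
writer.write(text)
writer.flush()
round_trip = buffer.getvalue().decode("utf-8")

print(failed)
print(safe)
```

The same idea applies to files: opening the output file with `encoding="utf-8"` avoids depending on the platform default.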

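BeautifulSoup can't find a class that exists on the webpage?

I am trying to scrape the following webpage, https://www.scoreboard.com/uk/football/england/premier-league/, specifically the scheduled and finished results. I am therefore looking for elements with class = "stage-finished" or "stage-scheduled". However, when I scrape the page and print out what page_soup contains, it does not contain these elements. I found ..

Posted: 2021-12-23 20:52:31 python beautifulsoup Python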

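BeautifulSoup: can't convert NavigableString to string

I am starting to learn Python and decided to write a simple scraper. One problem I have run into is that I cannot convert a NavigableString to a regular string. I am using BeautifulSoup4 and Python 3.5.1. Should I bite the bullet and move to an earlier version of Python and BeautifulSoup? Or is there a way to write my own function that converts a NavigableString to a regular unicode string? ..

Posted: 2021-12-23 20:52:25 python-3.x beautifulsoup Other development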
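
For the conversion asked about above, a short sketch under the assumption that Beautiful Soup 4 on Python 3 is in use: `NavigableString` is a subclass of `str`, so calling `str()` on it yields a plain string detached from the parse tree. The markup is a made-up example.

```python
from bs4 import BeautifulSoup, NavigableString

soup = BeautifulSoup("<p>Hello, <b>world</b></p>", "html.parser")

node = soup.b.string   # a NavigableString that still belongs to the tree
plain = str(node)      # an ordinary str with no reference to the tree

print(type(node).__name__)
print(type(plain).__name__)
print(plain)
```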

Removing the newline '\n' from the output of Python BeautifulSoup

I am using Python Beautiful Soup to get the contents of: abc def ghi My code is as follows: html_doc=""" abc def ghi"""; from bs4 import BeautifulSoup; soup = Bea ..

Posted: 2021-12-23 20:52:16 python beautifulsoup Python
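
A sketch of two ways to drop the newlines mentioned above: iterate over `stripped_strings`, or call `get_text()` with a separator and `strip=True`. The nested `div` structure is an assumption, because the tags in the question's `html_doc` did not survive in the excerpt.

```python
from bs4 import BeautifulSoup

# Assumed structure; only the texts abc/def/ghi appear in the excerpt.
html_doc = """
<div>
  <div> abc</div>
  <div> def</div>
  <div> ghi</div>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

raw = soup.get_text()                     # keeps the newlines and indentation
pieces = list(soup.stripped_strings)      # whitespace-only strings are skipped
joined = soup.get_text(" ", strip=True)   # strip each piece, join with a space

print(repr(raw))
print(pieces)
print(joined)
```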

Need to find text with RegEx and BeautifulSoup

Info Hookup: None Group site: No Station: No Details Ramp: Yes ..

Posted: 2021-12-23 20:52:09 python regex python-2.7 web-scraping beautifulsoup Python

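Using BeautifulSoup to parse lines separated by <br> tags?

I have a page that looks like this: Company A 123 Main St. Suite 101 Someplace, NY 1234 Company B 456 Main St. Someplace, NY 1234 Sometimes there are two instead of three "br" tags separating the entries. How would I use BeautifulSoup to parse this document and extract the fields? I'm stumped because ..

Posted: 2021-12-23 20:52:00 python parsing beautifulsoup Python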
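
One way to approach the `<br>` question above, sketched under stated assumptions: a single `<br>` separates fields, a run of two or more separates entries, and everything sits inside one `div`. The wrapper element and the exact tag layout are guesses, since the excerpt shows only the visible text.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Assumed layout: one <br> between fields, two or three between entries.
html = (
    "<div>Company A<br>123 Main St.<br>Suite 101<br>Someplace, NY 1234"
    "<br><br><br>\n"
    "Company B<br>456 Main St.<br>Someplace, NY 1234<br><br></div>"
)

soup = BeautifulSoup(html, "html.parser")

records = []   # finished entries, each a list of field strings
current = []   # fields of the entry being collected
br_run = 0     # consecutive <br> tags seen since the last text node

for node in soup.div.children:
    if isinstance(node, Tag) and node.name == "br":
        br_run += 1
        continue
    if not isinstance(node, NavigableString):
        continue
    text = node.strip()
    if not text:
        continue  # whitespace between tags does not break a run of <br>
    if br_run >= 2 and current:
        records.append(current)  # two or more <br> close the previous entry
        current = []
    current.append(text)
    br_run = 0

if current:
    records.append(current)

print(records)
```

Counting runs rather than looking for exactly three tags is what lets the sketch tolerate the "sometimes two instead of three" case.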

How to use BeautifulSoup to loop through a list of URLs for web scraping

Does anyone know how to scrape a list of URLs from the same website with BeautifulSoup? list = ['url1', 'url2', 'url3'...] ========================================================================== My code for extracting the list of URLs: url = 'http://www.hkjc.com/chinese/ ..

Posted: 2021-12-23 20:51:42 python beautifulsoup Python

Scraping tables inside comment tags with BeautifulSoup

I am trying to scrape tables from the following webpage using BeautifulSoup: https://www.pro-football-reference.com/boxscores/201702050atl.htm import requests; from bs4 import BeautifulSoup; url = 'https://www.pro-football-reference.com/boxscores/201702050atl.htm'; page ..

Posted: 2021-12-23 20:51:35 python web-scraping beautifulsoup Python
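
When tables are wrapped in HTML comments, as the question above describes, a parser sees them only as comment text. A minimal sketch of one workaround is to collect the comment nodes and parse each one again as HTML. The markup, the ids, and the cell values below are invented stand-ins and are not data from the page in the question.

```python
from bs4 import BeautifulSoup, Comment

# Invented stand-in for a page that hides a table inside a comment.
html = """
<div id="all_stats">
<!--
<table id="stats">
  <tr><th>Player</th><th>Yds</th></tr>
  <tr><td>A</td><td>10</td></tr>
</table>
-->
</div>
"""

soup = BeautifulSoup(html, "html.parser")

tables = []
for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
    if "<table" not in comment:
        continue  # ordinary comment, nothing to recover
    inner = BeautifulSoup(str(comment), "html.parser")  # second parse pass
    tables.extend(inner.find_all("table"))

rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in tables[0].find_all("tr")
]

print(rows)
```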

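Beautiful Soup does not wait until the page is fully loaded

Using the code below, I want to open an apartment website URL and scrape the page. The only problem is that Beautiful Soup does not wait until the entire page has rendered. The apartments do not appear in the HTML until they have loaded on the page, which takes a few seconds. How do I fix this? from urllib.request import urlopen as uReq; from bs4 import BeautifulSoup as soup; my_url = 'https://x ..

Posted: 2021-12-23 20:51:28 python html web-scraping beautifulsoup Front-end development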

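Pandas read_html - no tables found

I am trying to see whether I can read a data table from WU.com, but I get a type error because no tables are found. (This is also my first time web scraping.) Another person asked a very similar Stack Overflow question about the WU data table, but the solution is a bit complicated for me. import pandas as pd; df_list = pd.read_html('https://www.wunderground.com/history/daily/us ..

Posted: 2021-12-23 20:51:21 python pandas web-scraping beautifulsoup Python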

Beautifulsoup loses nodes

I am using Python and Beautifulsoup to parse HTML data and get p tags out of RSS feeds. However, some URLs cause problems because the parsed soup object does not include all nodes of the document. For example, I tried to parse http://feeds.chicagotribune.com/~r/ChicagoBreakingNews/~3/T2Zg3dk4L88/story01.htm ..

Posted: 2021-12-23 20:51:12 python beautifulsoup html5lib Python

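Requests.content does not match Chrome inspect element

I am using BeautifulSoup and Requests to scrape allrecipes user data. When inspecting the HTML code, I found that the data I want is present in the page, but when I use the following code URL = 'http://allrecipes.com/cook/2010/reviews/'; response = requests.get(URL).content; soup = BeautifulSoup(response, 'html.parse ..

Posted: 2021-12-23 20:50:54 python html beautifulsoup python-requests Front-end development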

Parsing HTML with Beautiful Soup returns an empty list

I have no idea why this code does not work for this particular site. In other cases it works fine. url = "http://www.i-apteka.pl/search.php?node=443&counter=all"; content = requests.get(url).text; soup = BeautifulSoup(content); links = soup.find_all("a", class_="n63009_prod_link"); print links ..

Posted: 2021-12-23 20:50:49 python django parsing beautifulsoup Python

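BeautifulSoup HTML: getting the src link

I am making a small web crawler with Python 3.5.1 and the requests module that downloads all the comics from a specific website. I am experimenting with one page. I parse the page using BeautifulSoup4 as follows: import webbrowser; import sys; import requests; import re; import bs4; res = requests.get('http://mangapark.me/manga/berserk/s5/c342'); res.raise_for_status(); soup = ..

Posted: 2021-12-23 20:50:42 python html python-3.x beautifulsoup html-parsing Front-end development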
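
For the src question above, a minimal sketch: select `img` tags that actually carry a `src` attribute and resolve each value against the page URL. The markup and the `example.com` base URL are placeholders and are not the site named in the question.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Placeholder markup and base URL for illustration only.
html = (
    '<div><img id="img-1" src="http://example.com/a.jpg">'
    '<img alt="no src">'
    '<a href="/x"><img src="/b.png"></a></div>'
)
base = "http://example.com/manga/page1"

soup = BeautifulSoup(html, "html.parser")

# src=True keeps only tags that have the attribute, so img["src"] cannot fail.
srcs = [urljoin(base, img["src"]) for img in soup.find_all("img", src=True)]

print(srcs)
```

Resolving with `urljoin` matters when a page mixes absolute and root-relative image paths.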
