beautifulsoup parsing - dealing with superscript?

Views: 122 Published: 2016/8/5 19:12:47 Tags: python html beautifulsoup


Problem description

This is the HTML segment I am trying to extract information from:

<td class="yfnc_tablehead1" width="74%">Market Cap (intraday)<font size="-1"><sup>5</sup></font>:</td><td class="yfnc_tabledata1"><span id="yfs_j10_aal">33.57B</span></td></tr>

On the web page it renders as:

Market Cap (intraday)⁵:33.57B

What I have (not working):

    HTML_MarketCap = soup.find('sup', text='5').find_next_sibling('span').text

How could I extract the 33.57B string?

Recommended answer

The span is not a sibling of the sup; it is a ~~child of the sibling of the grandparent~~ first cousin, once removed (thanks, 1.618). To reach it, go up two levels from the sup to its td, across to the next td, and down to the span inside it.

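from bs4 import BeautifulSoup as bs

# Name the parser explicitly; html.parser keeps the bare <td> cells as written.
soup = bs("""<td class="yfnc_tablehead1" width="74%">Market Cap (intraday)
<font size="-1"><sup>5</sup></font>:</td><td class="yfnc_tabledata1">
<span id="yfs_j10_aal">33.57B</span></td></tr>""", "html.parser")

# <sup> -> <font> -> label <td>, then across to the value <td>, then down to its <span>
soup.find("sup", text="5").parent.parent.find_next_sibling("td").find("span").text
# '33.57B'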
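
To see why the sibling lookup from the question fails, it can help to inspect the tree around the sup element step by step. The following is a minimal sketch that assumes the fragment from the question and the standard-library `html.parser`; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# The fragment from the question, parsed with the standard-library parser.
html = (
    '<td class="yfnc_tablehead1" width="74%">Market Cap (intraday)'
    '<font size="-1"><sup>5</sup></font>:</td>'
    '<td class="yfnc_tabledata1"><span id="yfs_j10_aal">33.57B</span></td>'
)
soup = BeautifulSoup(html, "html.parser")

sup = soup.find("sup")

# <sup> is the only child of <font>, so it has no siblings at all.
# This is why the attempt in the question ends up calling .text on None.
no_sibling = sup.find_next_sibling("span")

# Walk upward: <sup> -> <font> -> <td> (the label cell).
ancestors = [sup.parent.name, sup.parent.parent.name]

# The value cell is the next <td> sibling of the label cell.
value_cell = sup.parent.parent.find_next_sibling("td")
market_cap = value_cell.find("span").get_text()

print(no_sibling, ancestors, market_cap)
```

Here `find_next_sibling` only searches elements that share the same parent, which is why the search has to start from the label td rather than from the sup itself.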

Since you seem to have problems with it, here's my full test script (using python-requests), that reliably works for me:

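import requests
from bs4 import BeautifulSoup as bs

url = "https://finance.yahoo.com/q/ks?s=AAL+Key+Statistics"

r = requests.get(url)

# Name the parser explicitly so the result does not depend on what is installed
soup = bs(r.text, "html.parser")

# Up from <sup> to its <td>, across to the next <td>, down to the <span>
HTML_MarketCap = soup.find("sup", text="5").parent.parent.find_next_sibling("td").find("span").text

print(HTML_MarketCap)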
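
The one-line chain raises an AttributeError as soon as any step returns None, for example if the page markup changes. Below is a hedged sketch of the same navigation with a guard at each step. The function name `market_cap_from_html` is hypothetical and not part of the original answer. It uses `find_parent("td")` instead of `.parent.parent`, and `string=`, which is the newer spelling of the `text=` argument (this assumes Beautiful Soup 4.4.0 or later).

```python
from bs4 import BeautifulSoup


def market_cap_from_html(html):
    """Return the Market Cap text, or None if the expected markup is missing.

    Hypothetical helper: same navigation as the answer, with None checks.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Locate the footnote marker <sup>5</sup>.
    sup = soup.find("sup", string="5")
    if sup is None:
        return None

    # Climb to the enclosing label cell, however deep the <sup> is nested.
    label_cell = sup.find_parent("td")
    if label_cell is None:
        return None

    # The value lives in the next <td> on the same level.
    value_cell = label_cell.find_next_sibling("td")
    if value_cell is None:
        return None

    span = value_cell.find("span")
    return span.get_text(strip=True) if span is not None else None


sample = (
    '<td class="yfnc_tablehead1" width="74%">Market Cap (intraday)'
    '<font size="-1"><sup>5</sup></font>:</td>'
    '<td class="yfnc_tabledata1"><span id="yfs_j10_aal">33.57B</span></td>'
)
print(market_cap_from_html(sample))
```

With a live page, the same function could be called on `r.text` from the script above.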
