Unable to parse HTML table with Beautiful Soup

Question

I am very new to Beautiful Soup and am trying to import the data from the URL below into a pandas DataFrame. The final result has the correct column names, but the rows contain no numbers. What should I be doing instead?

Here is my code:

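from bs4 import BeautifulSoup
import pandas as pd  # needed for pd.read_html below
import requests

def get_tables(html):
    # Parse the downloaded HTML and collect every <table> element
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find_all('table')
    # Hand the tables to pandas and return the first one as a DataFrame
    return pd.read_html(str(table))[0]

url = 'https://www.cmegroup.com/trading/interest-rates/stir/eurodollar.html'
html = requests.get(url).content
get_tables(html)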
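
A quick check that can help with a symptom like this is to count the data rows in the HTML that requests actually receives, before pandas gets involved. The sketch below is only an illustration: SAMPLE_HTML and count_data_rows are hypothetical names, and the sample markup is an invented stand-in for a page whose table body is filled in later by a script, not a copy of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a page whose table body is populated by JavaScript:
# the header row is present in the static HTML, the data rows are not.
SAMPLE_HTML = """
<table id="quotes">
  <thead><tr><th>Month</th><th>Last</th><th>Change</th></tr></thead>
  <tbody></tbody>
</table>
"""

def count_data_rows(html):
    """Return, for each <table>, the number of rows containing at least one <td>."""
    soup = BeautifulSoup(html, "html.parser")
    counts = []
    for table in soup.find_all("table"):
        # Header rows only hold <th> cells, so they are not counted here
        rows = [tr for tr in table.find_all("tr") if tr.find("td") is not None]
        counts.append(len(rows))
    return counts

print(count_data_rows(SAMPLE_HTML))  # prints [0]
```

If a real page gives zeros here as well, the numbers were never in the downloaded HTML, and no change to the parsing code will recover them.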

Recommended answer

The data you see in the table is loaded from another URL via JavaScript, so it is not part of the HTML that requests downloads. You can use this example to save the data to CSV:

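import json
import requests
import pandas as pd

# The table's data comes from this JSON endpoint rather than from the page HTML
data = requests.get('https://www.cmegroup.com/CmeWS/mvc/Quotes/Future/1/G').json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

# Flatten the list of quote records into a DataFrame and write it to disk
df = pd.json_normalize(data['quotes'])
df.to_csv('data.csv')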

The script above saves the quotes to data.csv, which can then be opened in a spreadsheet program such as LibreOffice.
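
To make the flattening step less opaque, here is a rough standard-library sketch of what pd.json_normalize does with nested dictionaries: nested keys are joined with a dot to form column names. It is not pandas' implementation, flatten is a hypothetical helper, and the payload with its field names is invented for illustration rather than taken from the real endpoint.

```python
import csv
import io

def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the dotted prefix along
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

# Hypothetical payload: these field names are assumptions, not the real response
payload = {
    "quotes": [
        {"month": "JAN", "last": "99.1", "settle": {"price": "99.0"}},
        {"month": "FEB", "last": "99.2", "settle": {"price": "99.1"}},
    ]
}

rows = [flatten(quote) for quote in payload["quotes"]]

# Write the flattened rows as CSV into an in-memory buffer instead of a file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

In this sketch the nested settle dictionary becomes a settle.price column. Printing the JSON first, as the commented-out line in the answer suggests, shows which keys the real response contains.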
