Exporting several scraped tables into a single CSV file

Question

How can I concatenate the tables read from several HTML pages? I understand that they are returned as lists and that lists cannot be concatenated, so how can I write more than one table, each scraped from a different URL, into a single CSV? Any ideas? Is it possible to save the printed output in a variable and then move it into a CSV?

import pandas as pd

df = pd.read_html('URL')
df1 = pd.read_html('URL')

print(df, df1)

(df, df1).to_csv('name.csv')


The expression (df, df1) is of course incorrect; I only wrote it to show what I am missing.

Thank you very much.

Recommended answer

pd.read_html returns a list of DataFrames. So, if you are sure that the lists contain DataFrames formatted in a way that allows them to be concatenated, you can consolidate them into a single DataFrame and then export it to CSV:

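import pandas as pd

# Each call returns a list of DataFrames, one per table found on the page
dframes_list1 = pd.read_html('URL1')
dframes_list2 = pd.read_html('URL2')
# Python lists can be joined with +
dframes_all = dframes_list1 + dframes_list2
# Stack all tables into one DataFrame and write it out
consolidated_dframe = pd.concat(dframes_all)
consolidated_dframe.to_csv('name.csv')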
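
To try the same approach without fetching any page, here is a minimal sketch that uses hand-built DataFrames as stand-ins for the lists pd.read_html would return. The column names and values are made up for illustration. It also passes two optional arguments that the answer above does not use: ignore_index=True in pd.concat and index=False in to_csv, which may be useful when the original row numbers of each table are not wanted in the output.

```python
import io
import pandas as pd

# Hypothetical stand-ins for pd.read_html('URL1') and pd.read_html('URL2').
# Each list holds DataFrames, just like the return value of read_html.
dframes_list1 = [pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})]
dframes_list2 = [pd.DataFrame({"name": ["c"], "value": [3]})]

# Plain Python lists can be joined with +, giving one list of DataFrames.
dframes_all = dframes_list1 + dframes_list2

# ignore_index=True renumbers the rows 0..n-1 instead of repeating
# each table's own index.
consolidated_dframe = pd.concat(dframes_all, ignore_index=True)

# index=False leaves the row index out of the CSV. An in-memory buffer is
# used here; a file path such as 'name.csv' can be passed instead.
buffer = io.StringIO()
consolidated_dframe.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

If the tables do not share the same columns, pd.concat still runs but fills the missing cells with NaN, so it is worth inspecting consolidated_dframe before exporting.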
