How do I use Pandas to read in multiple datasets from one file?

Views: 51 | Posted: 2021/6/13 20:39:06 | Tags: python, pandas

Problem description

I have a file that has multiple sets of data separated by rows. It looks something like:

country1  
0.9  
1.3  
2.9  
1.1  
...  
country2  
4.1  
3.1  
0.2
...

I would like to use Pandas to read the whole file into multiple dataframes, where each dataframe corresponds to a country. Is there any easy way to do this? Each country has a different number of entries.

Recommended answer

You can build a grouping key with to_numeric and errors='coerce': the rows holding country names cannot be parsed as numbers, so they become NaN. Flag those rows with isnull, then take cumsum so that every row gets the number of the country block it belongs to:

import io

import pandas as pd

# Sample data standing in for the real file
temp = """country1
0.9
1.3
2.9
1.1
country2
4.1
3.1
0.2"""
# After testing, replace io.StringIO(temp) with the path to your file
df = pd.read_csv(io.StringIO(temp), header=None)
print(df)
          0
0  country1
1       0.9
2       1.3
3       2.9
4       1.1
5  country2
6       4.1
7       3.1
8       0.2

# Country-name rows become NaN; the running count of those rows numbers each block
mask = pd.to_numeric(df.iloc[:, 0], errors='coerce').isnull().cumsum()
print(mask)
0    1
1    1
2    1
3    1
4    1
5    2
6    2
7    2
8    2
Name: 0, dtype: int32
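
To see why the one-liner above produces one group number per country, it can help to look at each intermediate step side by side. The following is a minimal sketch on a shortened sample; the names sample, numeric, is_header, group_id and steps are illustrative and not part of the original answer.

```python
import io

import pandas as pd

# Shortened sample data (assumption: same layout as the file in the question)
sample = "country1\n0.9\n1.3\ncountry2\n4.1\n"
df = pd.read_csv(io.StringIO(sample), header=None)

# Step 1: try to parse every row as a number; country names become NaN
numeric = pd.to_numeric(df.iloc[:, 0], errors="coerce")

# Step 2: True exactly on the rows that hold a country name
is_header = numeric.isnull()

# Step 3: running count of header rows, so each block shares one number
group_id = is_header.cumsum()

# Put the steps next to each other for inspection
steps = pd.DataFrame(
    {
        "raw": df.iloc[:, 0],
        "numeric": numeric,
        "is_header": is_header,
        "group_id": group_id,
    }
)
print(steps)
```

Note that this approach assumes no country name can itself be parsed as a number, and that no data row is non-numeric; either case would shift the group boundaries.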

Finally, use a list comprehension to build a list of DataFrames, one per group. The first row of each group supplies the column name, and the remaining rows are the data:

dfs = [g[1:].rename(columns={0:g.iloc[0].values[0]}) for i, g in df.groupby(mask)]

print (dfs)

print (dfs[0])
  country1
1      0.9
2      1.3
3      2.9
4      1.1

print (dfs[1])
  country2
6      4.1
7      3.1
8      0.2

If you need to reset the index:

dfs = [g[1:].rename(columns={0:g.iloc[0].values[0]}).reset_index(drop=True) for i, g in df.groupby(mask)]

print (dfs)

print (dfs[0])
  country1
0      0.9
1      1.3
2      2.9
3      1.1
print (dfs[1])
  country2
0      4.1
1      3.1
2      0.2
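
One thing to be aware of: because the country names and the numbers share a single column, read_csv loads every value as text, so the frames above hold strings such as '0.9' rather than floats. The sketch below is one possible variation, not part of the original answer: it converts each block to numbers and stores the frames in a dict keyed by country name. The names sample, col, group_id and by_country are illustrative.

```python
import io

import pandas as pd

# Sample data (assumption: same layout as the file in the question)
sample = "country1\n0.9\n1.3\n2.9\n1.1\ncountry2\n4.1\n3.1\n0.2\n"
df = pd.read_csv(io.StringIO(sample), header=None)

col = df.iloc[:, 0]
# Same grouping key as in the answer: one number per country block
group_id = pd.to_numeric(col, errors="coerce").isnull().cumsum()

by_country = {}
for _, g in col.groupby(group_id):
    name = g.iloc[0]  # first row of the block is the country name
    # Remaining rows are the data; convert them from text to floats
    values = pd.to_numeric(g.iloc[1:]).reset_index(drop=True)
    by_country[name] = values.to_frame(name)

print(by_country["country2"])
print(by_country["country2"].dtypes)
```

With a dict, a frame can be looked up by name (by_country["country1"]) instead of by position in a list.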
