Split pandas dataframe by String
Question
I'm new to using Pandas dataframes. I have data in a .csv like this:
foo, 1234,
bar, 4567
stuff, 7894
New Entry,,
morestuff,1345
I read it into a dataframe with:
df = pd.read_csv
But what I really want is a new dataframe (or a way of splitting the current one) every time I have a "New Entry" line (obviously without including it). How could this be done?
Recommended answer
1) One approach is to do it on the fly: read the file line by line and start a new dataframe whenever the NewEntry break appears.
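A minimal sketch of this first approach, assuming the marker line starts with the text "New Entry" as in the question's sample and that there are two columns. The function name split_on_marker, the column names col1/col2, and the in-memory sample are illustrative, not part of the original answer:

```python
import io
import pandas as pd

def split_on_marker(lines, marker="New Entry", names=("col1", "col2")):
    """Build one dataframe per block of lines, breaking at each marker line."""
    chunks, current = [], []
    for line in lines:
        fields = [f.strip() for f in line.strip().split(",")]
        if not fields[0]:
            continue  # skip blank lines
        if fields[0] == marker:
            # Marker found: close the current block and drop the marker itself
            if current:
                chunks.append(current)
                current = []
            continue
        # Pad or truncate so every row has exactly len(names) fields
        row = (fields + [None] * len(names))[:len(names)]
        current.append(row)
    if current:
        chunks.append(current)
    return [pd.DataFrame(c, columns=list(names)) for c in chunks]

# In-memory stand-in for the .csv file from the question
sample = io.StringIO(
    "foo, 1234,\nbar, 4567\nstuff, 7894\nNew Entry,,\nmorestuff,1345\n"
)
frames = split_on_marker(sample)
```

Any iterable of lines works, so an open file handle can be passed in place of sample. The values stay as strings here; converting col2 with pd.to_numeric would be a separate step.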
2) The other way, if the dataframe already exists, is to find the NewEntry rows and slice the dataframe into multiple ones, stored in a dict dff = {}.
df
col1 col2
0 foo 1234
1 bar 4567
2 stuff 7894
3 NewEntry NaN
4 morestuff 1345
Find the NewEntry rows, adding [-1] and [len(df.index)] as boundary conditions:
import numpy as np
rows = [-1] + np.where(df['col1']=='NewEntry')[0].tolist() + [len(df.index)]
rows
[-1, 3, 5]
Create the dict of dataframes
dff = {}
for i, r in enumerate(rows[:-1]):
    dff[i] = df[r+1: rows[i+1]]
Dict of dataframes {0: dataframe1, 1: dataframe2}
dff
{0: col1 col2
0 foo 1234
1 bar 4567
2 stuff 7894, 1: col1 col2
4 morestuff 1345}
Dataframe 1
dff[0]
col1 col2
0 foo 1234
1 bar 4567
2 stuff 7894
Dataframe 2
dff[1]
col1 col2
4 morestuff 1345