Split pandas dataframe by String

Question

I'm new to using Pandas dataframes. I have data in a .csv like this:

foo, 1234,
bar, 4567
stuff, 7894
New Entry,,
morestuff,1345

I load it into a dataframe with:

 df = pd.read_csv

But what I really want is a new dataframe (or a way of splitting the current one) every time I have a "New Entry" line (obviously without including it). How could this be done?

Recommended answer

1) One approach is to do it on the fly: read the file line by line and start a new dataframe each time a NewEntry line appears.
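
A minimal sketch of this first approach, under stated assumptions: the helper name `split_on_marker`, the column names `col1`/`col2`, and the in-memory sample are all hypothetical, and the marker text is taken from the question's data ("New Entry"). The values are kept as strings here; convert them afterwards if numbers are needed.

```python
import io
import pandas as pd

def split_on_marker(lines, marker="New Entry"):
    """Yield one DataFrame per block of lines found between marker lines."""
    block = []
    for line in lines:
        # Split on commas and trim whitespace; trailing commas give empty fields.
        fields = [f.strip() for f in line.strip().split(",")]
        if fields[0] == marker:
            # Marker line: emit the block collected so far and start a new one.
            if block:
                yield pd.DataFrame(block, columns=["col1", "col2"])
            block = []
        elif fields[0]:
            # Keep the first two fields, padding if the line has only one.
            block.append((fields + [""])[:2])
    if block:
        yield pd.DataFrame(block, columns=["col1", "col2"])

# Hypothetical stand-in for the .csv file from the question.
sample = io.StringIO(
    "foo, 1234,\nbar, 4567\nstuff, 7894\nNew Entry,,\nmorestuff,1345\n"
)
frames = list(split_on_marker(sample))
```

With a real file, the same generator could be fed an open file handle instead of the `io.StringIO` object.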

2) Alternatively, if the dataframe already exists, find the NewEntry rows and slice the dataframe into multiple ones, stored in a dict dff = {}

df                                                                 
        col1  col2  
0        foo  1234    
1        bar  4567                
2      stuff  7894                                                        
3   NewEntry   NaN                       
4  morestuff  1345 

Find the NewEntry rows, add [-1] and [len(df.index)] for boundary conditions

import numpy as np
rows = [-1] + np.where(df['col1']=='NewEntry')[0].tolist() + [len(df.index)]
rows
[-1, 3, 5]

Create the dict of dataframes

dff = {}
for i, r in enumerate(rows[:-1]):
    # rows between one boundary (exclusive) and the next (exclusive), by position
    dff[i] = df.iloc[r+1: rows[i+1]]

Dict of dataframes {0: dataframe1, 1: dataframe2}

dff                           
{0:     col1  col2            
 0    foo  1234               
 1    bar  4567               
 2  stuff  7894, 1:         col1  col2  
 4  morestuff  1345}

Dataframe 1

dff[0]              
    col1  col2      
0    foo  1234      
1    bar  4567      
2  stuff  7894      

Dataframe 2

dff[1]              
        col1  col2  
4  morestuff  1345 
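
For reference, the steps of the second approach can be collected into one self-contained sketch. The function name `split_by_marker` and its parameters are hypothetical, and the sample dataframe is rebuilt by hand to match the one shown above.

```python
import numpy as np
import pandas as pd

# Sample dataframe matching the one shown in the answer.
df = pd.DataFrame({
    "col1": ["foo", "bar", "stuff", "NewEntry", "morestuff"],
    "col2": [1234, 4567, 7894, np.nan, 1345],
})

def split_by_marker(df, column="col1", marker="NewEntry"):
    """Return a dict of sub-dataframes split at rows where column == marker."""
    # Positions of the marker rows, with -1 and len(df) added as outer boundaries.
    hits = np.where(df[column] == marker)[0].tolist()
    rows = [-1] + hits + [len(df.index)]
    # Slice by position between each pair of boundaries, skipping the marker row.
    return {
        i: df.iloc[start + 1:stop]
        for i, (start, stop) in enumerate(zip(rows[:-1], rows[1:]))
    }

dff = split_by_marker(df)
```

Note that consecutive marker rows, or a marker in the first or last row, would produce empty dataframes in the dict; filter those out if they are not wanted.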
