Edit dataframe entries using a groupby object (pandas)
Problem description
Consider the following dataframe:
index  count  signal
    1      1       1
    2      1     NaN
    3      1     NaN
    4      1      -1
    5      1     NaN
    6      2     NaN
    7      2      -1
    8      2     NaN
    9      3     NaN
   10      3     NaN
   11      3     NaN
   12      4       1
   13      4     NaN
   14      4     NaN
I need to forward-fill ('ffill') the NaNs in 'signal', and rows with different 'count' values should not affect each other, so that I get the following dataframe:
index  count  signal
    1      1       1
    2      1       1
    3      1       1
    4      1      -1
    5      1      -1
    6      2     NaN
    7      2      -1
    8      2      -1
    9      3     NaN
   10      3     NaN
   11      3     NaN
   12      4       1
   13      4       1
   14      4       1
Right now I iterate over each group in the groupby object, fill the NaN values, and then copy the result to a new dataframe:
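import numpy as np
import pandas as pd

new_table = np.array([])  # empty placeholder until the first group arrives
for key, group in df.groupby('count'):
    group1 = group.copy()  # copy first, then modify the copy
    group1['signal'] = group1['signal'].ffill()  # forward-fill within this group only
    if new_table.shape[0] == 0:
        new_table = group1
    else:
        # re-concatenating on every iteration copies all rows gathered so far
        new_table = pd.concat([new_table, group1])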
This kind of works, but it is really slow given that the dataframe is large. I am wondering whether there is another way to do it, with or without groupby. Thanks!
Edit
Thanks to Alexander and jwilner for providing alternative methods. However, both methods are very slow for my large dataframe, which has 800,000 rows.
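For a frame of this size, a possible alternative that avoids groupby entirely is to work on the underlying NumPy arrays. The sketch below assumes that the rows of each 'count' group are contiguous, as they are in the example; if they are not, the frame would need a stable sort by 'count' first. The helper name `ffill_within_groups` is hypothetical, and whether this is actually faster on a given dataset is something to time rather than assume:

```python
import numpy as np
import pandas as pd


def ffill_within_groups(keys, values):
    """Forward-fill NaNs in `values` without crossing group boundaries.

    Assumes rows belonging to the same key are contiguous.
    """
    keys = np.asarray(keys)
    values = np.asarray(values, dtype=float)
    n = len(values)
    pos = np.arange(n)

    # Position of the most recent non-NaN value at or before each row
    # (-1 while no valid value has been seen yet).
    last_valid = np.maximum.accumulate(np.where(~np.isnan(values), pos, -1))

    # Position at which the current row's group starts.
    is_start = np.ones(n, dtype=bool)
    is_start[1:] = keys[1:] != keys[:-1]
    group_start = np.maximum.accumulate(np.where(is_start, pos, 0))

    # Only fill when the most recent valid value lies inside the same group.
    can_fill = last_valid >= group_start
    out = np.full(n, np.nan)
    out[can_fill] = values[last_valid[can_fill]]
    return out


# Usage on the example data from the question.
df = pd.DataFrame(
    {
        "count": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
        "signal": [1, np.nan, np.nan, -1, np.nan, np.nan, -1, np.nan,
                   np.nan, np.nan, np.nan, 1, np.nan, np.nan],
    },
    index=range(1, 15),
)
df["signal"] = ffill_within_groups(df["count"].to_numpy(), df["signal"].to_numpy())
print(df)
```

The idea is that a running maximum over row positions gives both "where was the last valid value" and "where did this group start"; a row is filled only when the first position is not earlier than the second.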