Retain a few NAs and drop the rest during a stack operation in Python
Question
I have a dataframe like the one shown below:
import numpy as np
import pandas as pd

df2 = pd.DataFrame({
    'person_id': [1],
    'H1_date': ['2006-10-30 00:00:00'], 'H1': [2.3],
    'H2_date': ['2016-10-30 00:00:00'], 'H2': [12.3],
    'H3_date': ['2026-11-30 00:00:00'], 'H3': [22.3],
    'H4_date': ['2106-10-30 00:00:00'], 'H4': [42.3],
    'H5_date': [np.nan], 'H5': [np.nan],
    'H6_date': ['2006-10-30 00:00:00'], 'H6': [2.3],
    'H7_date': [np.nan], 'H7': [2.3],
    'H8_date': ['2006-10-30 00:00:00'], 'H8': [np.nan],
})
As shown above, my source dataframe (df2) contains a few NAs.
When I call df2.stack(), I lose all the NAs from the data.
However, I would like to retain the NAs for H7_date and H8, because each of them has a valid counterpart in its value/date pair: H7_date is NA but H7 holds a valid value, and H8 is NA but H8_date holds a valid date.
I would like to drop records only when both values of a pair (for example H5_date and H5) are NA.
Please note that I show only a few columns here. My real data has more than 150 columns, and the column names aren't known in advance.
I expect the output to omit H5_date and H5, since both are NA, while keeping every other row, including the NAs for H7_date and H8.
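The rule described in the question (drop a pair only when both its value and its date are NA) can be checked directly on the wide frame without hard-coding column names. The following is a minimal sketch, assuming every value column `X` has a companion column named `X_date`; the names `date_cols`, `prefixes`, and `all_na` are illustrative and not part of the original post, and the frame is a shortened version of df2.

```python
import numpy as np
import pandas as pd

# Shortened version of the question's df2 (only the interesting pairs).
df2 = pd.DataFrame({
    'person_id': [1],
    'H5_date': [np.nan], 'H5': [np.nan],
    'H7_date': [np.nan], 'H7': [2.3],
    'H8_date': ['2006-10-30 00:00:00'], 'H8': [np.nan],
})

# Assumption: each value column X is paired with a column named X_date.
date_cols = [c for c in df2.columns if c.endswith('_date')]
prefixes = [c[:-len('_date')] for c in date_cols]

# For the single row, record whether both members of each pair are NA.
all_na = {
    p: bool(df2[[p, f'{p}_date']].isna().all(axis=1).iloc[0])
    for p in prefixes
}
print(all_na)  # {'H5': True, 'H7': False, 'H8': False}
```

Only H5 is flagged, which matches the pair the question wants removed.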
Recommended answer
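# Reshape to long format; melt keeps the NaN rows
df = pd.melt(df2, id_vars='person_id', var_name='col', value_name='dates')
# 'H1_date' and 'H1' share the prefix 'H1'
df['col2'] = df['col'].str.split("_").str[0]
# Number of non-NA entries in each value/date pair
df['count'] = df.groupby(['col2'])['dates'].transform('count')
# Drop pairs where both entries are NA, then remove the helper columns
df = df[df['count'] != 0]
df = df.drop(columns=['col2', 'count'])
print(df)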
    person_id      col                dates
0           1  H1_date  2006-10-30 00:00:00
1           1       H1                  2.3
2           1  H2_date  2016-10-30 00:00:00
3           1       H2                 12.3
4           1  H3_date  2026-11-30 00:00:00
5           1       H3                 22.3
6           1  H4_date  2106-10-30 00:00:00
7           1       H4                 42.3
10          1  H6_date  2006-10-30 00:00:00
11          1       H6                  2.3
12          1  H7_date                  NaN
13          1       H7                  2.3
14          1  H8_date  2006-10-30 00:00:00
15          1       H8                  NaN
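The answer groups on the column prefix alone, which is enough for the single-person example. With several person_id values, a pair that is entirely NA for one person would survive if any other person has data for it. A possible extension, sketched here under the assumption that date columns always end in `_date`, is to group on both the person and the prefix. The function name `melt_keep_partial_pairs`, the `value` column name, and the two-person sample frame are hypothetical and not from the original answer.

```python
import numpy as np
import pandas as pd

def melt_keep_partial_pairs(wide, id_col='person_id'):
    """Melt to long format, dropping a value/date pair only when both are NA."""
    long = pd.melt(wide, id_vars=id_col, var_name='col', value_name='value')
    # 'H7_date' and 'H7' both map to the prefix 'H7'.
    prefix = long['col'].str.replace(r'_date$', '', regex=True)
    # Non-NA count per (id, prefix); 0 means the whole pair is missing.
    n_valid = long.groupby([long[id_col], prefix])['value'].transform('count')
    return long[n_valid > 0].reset_index(drop=True)

# Hypothetical two-person frame: person 1 lacks H5 entirely, person 2 lacks H7.
wide = pd.DataFrame({
    'person_id': [1, 2],
    'H5_date': [np.nan, '2016-10-30 00:00:00'], 'H5': [np.nan, 12.3],
    'H7_date': [np.nan, np.nan], 'H7': [2.3, np.nan],
})
result = melt_keep_partial_pairs(wide)
print(result)
```

In this sample, person 1 keeps both H7 rows (including the NA date), person 2 keeps both H5 rows, and the two fully empty pairs are removed.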