Pandas: flattening a MultiIndex column containing tuples while ignoring missing values
Problem description
I have a pandas DataFrame with a MultiIndex, like this:
import pandas as pd

lst = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (21, 22)]
df = pd.DataFrame(lst, pd.MultiIndex.from_product([['A', 'B'], ['1','2', '3', '4']])).loc[:('B', '2')]
df["tuple"] = list(zip(df[0], df[1]))
#df:
      0   1     tuple
A 1   1   2    (1, 2)
  2   3   4    (3, 4)
  3   5   6    (5, 6)
  4   7   8    (7, 8)
B 1   9  10   (9, 10)
  2  11  12  (11, 12)
I want to transform the column containing the tuples into one list of tuples per outer index label. My approach is:
#dataframe to append list of tuples
new_df = pd.DataFrame([1, 2], index = list("AB") )
#voila a list of tuples
new_df["list_of_tuples"] = df["tuple"].unstack(level = -1).values.tolist()
#new_df:
   0                    list_of_tuples
A  1  [(1, 2), (3, 4), (5, 6), (7, 8)]
B  2   [(9, 10), (11, 12), None, None]
This works, but only for MultiIndex DataFrames in which every outer-level entry has the same number of rows. If the entries differ in length, the missing cells show up as None values in the list. My attempts to remove the numpy NaN values before creating the list failed. Is there a way to prevent None from appearing in the final list of tuples?
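To see where the placeholders come from, it can help to look at the intermediate result of unstack before it is converted to lists. The sketch below rebuilds the frame from the question and counts the missing cells per row. The name wide is introduced here for illustration only. Whether a missing cell prints as None or NaN may depend on the pandas version, so the check uses isna() rather than comparing against a specific value.

```python
import pandas as pd

# Rebuild the example frame from the question
lst = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (21, 22)]
index = pd.MultiIndex.from_product([["A", "B"], ["1", "2", "3", "4"]])
df = pd.DataFrame(lst, index=index).loc[:("B", "2")]
df["tuple"] = list(zip(df[0], df[1]))

# unstack pivots the inner level into columns, producing a rectangular
# grid: one row per outer label, one column per inner label
wide = df["tuple"].unstack(level=-1)

# Cells with no matching (outer, inner) pair are filled with a missing value
missing_per_row = wide.isna().sum(axis=1)
print(wide.shape)
print(missing_per_row)
```

Because the grid must be rectangular, row B receives filler cells for the inner labels '3' and '4', and those fillers are carried into the lists by values.tolist().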
Recommended answer
Is this what you need?
df.groupby(level=[0]).tuple.apply(list)
Out[306]:
A    [(1, 2), (3, 4), (5, 6), (7, 8)]
B                 [(9, 10), (11, 12)]
Name: tuple, dtype: object
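A self-contained version of this approach, including the step of attaching the result to new_df, might look like the sketch below. It assumes the same example data as the question. The name grouped is introduced here for illustration, and the bracket form ["tuple"] is used instead of attribute access purely as a style choice.

```python
import pandas as pd

# Rebuild the example frame from the question
lst = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14), (21, 22)]
index = pd.MultiIndex.from_product([["A", "B"], ["1", "2", "3", "4"]])
df = pd.DataFrame(lst, index=index).loc[:("B", "2")]
df["tuple"] = list(zip(df[0], df[1]))

# Group by the outer index level and collect each group's tuples.
# Only rows that actually exist are visited, so no filler values appear.
grouped = df.groupby(level=0)["tuple"].apply(list)

# Assigning the Series aligns it with new_df on the index labels A and B
new_df = pd.DataFrame([1, 2], index=list("AB"))
new_df["list_of_tuples"] = grouped
print(new_df)
```

Grouping avoids building the rectangular grid that unstack creates, which is why the lists can have different lengths.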