Pandas: split column of lists of unequal length into multiple columns
Question
I have a Pandas dataframe that looks like the below:
codes
1 [71020]
2 [77085]
3 [36415]
4 [99213, 99287]
5 [99233, 99233, 99233]
I'm trying to split the lists in df['codes'] into columns, like the below:
code_1 code_2 code_3
1 71020
2 77085
3 36415
4 99213 99287
5 99233 99233 99233
where columns that don't have a value (because the list was not that long) are filled with blanks or NaNs or something.
I've seen answers like this one and others similar to it, and while they work on lists of equal length, they all throw errors when I try to use the methods on lists of unequal length. Is there a good way to do this?
Recommended answer
Try:
pd.DataFrame(df.codes.values.tolist()).add_prefix('code_')
code_0 code_1 code_2
0 71020 NaN NaN
1 77085 NaN NaN
2 36415 NaN NaN
3 99213 99287.0 NaN
4 99233 99233.0 99233.0
To include the index:
pd.DataFrame(df.codes.values.tolist(), df.index).add_prefix('code_')
code_0 code_1 code_2
1 71020 NaN NaN
2 77085 NaN NaN
3 36415 NaN NaN
4 99213 99287.0 NaN
5 99233 99233.0 99233.0
We can nail down all the formatting with this:
f = lambda x: 'code_{}'.format(x + 1)
pd.DataFrame(
    df.codes.values.tolist(),
    df.index, dtype=object
).fillna('').rename(columns=f)
code_1 code_2 code_3
1 71020
2 77085
3 36415
4 99213 99287
5 99233 99233 99233
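If blank strings mixed in with integers are a problem downstream, one possible variation is to keep the missing cells as proper missing values and cast to pandas' nullable integer dtype instead of using dtype=object and fillna(''). The sketch below is self-contained and rebuilds the sample frame from the question; the name `wide` and the choice of "Int64" are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question: lists of unequal length, index 1..5.
df = pd.DataFrame(
    {"codes": [[71020], [77085], [36415], [99213, 99287], [99233, 99233, 99233]]},
    index=range(1, 6),
)

# Building a DataFrame from a list of lists pads the shorter rows with NaN,
# which is why the padded columns show up as floats (e.g. 99287.0).
wide = pd.DataFrame(df["codes"].tolist(), index=df.index)

# Rename the default column labels 0, 1, 2 to code_1, code_2, code_3.
wide.columns = [f"code_{i + 1}" for i in wide.columns]

# Assumption: nullable integers are acceptable, so missing cells become <NA>
# and the codes stay integers rather than floats or strings.
wide = wide.astype("Int64")

print(wide)
```

The trade-off is that the empty cells print as <NA> rather than as blanks, but every column keeps a single integer dtype, which tends to be easier to filter or join on later.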