python pandas: from a 0/1 dataframe to an itemset list
Question
What is the most efficient way to go from a 0/1 pandas/numpy dataframe of this form:
>>> dd
{'a': {0: 1, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1},
'b': {0: 1, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1},
'c': {0: 0, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1},
'd': {0: 0, 1: 1, 2: 1, 3: 1, 4: 0, 5: 1},
'e': {0: 0, 1: 0, 2: 1, 3: 0, 4: 0, 5: 0}}
>>> df = pd.DataFrame(dd)
>>> df
   a  b  c  d  e
0  1  1  0  0  0
1  0  1  1  1  0
2  1  0  1  1  1
3  0  0  0  1  0
4  1  1  1  0  0
5  1  1  1  1  0
>>>
to an itemset list of lists, where each row becomes the list of column names that hold a 1?
itemset = [['a', 'b'],
['b', 'c', 'd'],
['a', 'c', 'd', 'e'],
['d'],
['a', 'b', 'c'],
['a', 'b', 'c', 'd']]
The real data has df.shape ~ (1e6, 500).
Recommended answer
You can first multiply the DataFrame by its column names with mul, then convert the result to a numpy array with values:
print (df.mul(df.columns.to_series()).values)
[['a' 'b' '' '' '']
['' 'b' 'c' 'd' '']
['a' '' 'c' 'd' 'e']
['' '' '' 'd' '']
['a' 'b' 'c' '' '']
['a' 'b' 'c' 'd' '']]
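The multiplication step relies on Python's string repetition: an integer times a string repeats the string that many times, so a 1 keeps the column name and a 0 yields an empty string. A minimal illustration (the variable names kept and dropped are made up for this sketch):

```python
# Integer * string repeats the string that many times.
kept = 1 * 'a'     # one copy of the column name
dropped = 0 * 'a'  # zero copies, i.e. the empty string

print(repr(kept), repr(dropped))
```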
Then remove the empty strings with a nested list comprehension:
print ([[y for y in x if y != ''] for x in df.mul(df.columns.to_series()).values])
[['a', 'b'],
['b', 'c', 'd'],
['a', 'c', 'd', 'e'],
['d'],
['a', 'b', 'c'],
['a', 'b', 'c', 'd']]
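For a frame of shape ~ (1e6, 500), the intermediate array of strings may be large. The following is a sketch of an alternative that is not part of the answer above: it treats each row as a boolean mask and uses it to index the array of column names directly. It assumes every cell is exactly 0 or 1, and it makes no claim about relative speed, which would need to be measured on the real data.

```python
import pandas as pd

dd = {'a': {0: 1, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1},
      'b': {0: 1, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1},
      'c': {0: 0, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1},
      'd': {0: 0, 1: 1, 2: 1, 3: 1, 4: 0, 5: 1},
      'e': {0: 0, 1: 0, 2: 1, 3: 0, 4: 0, 5: 0}}
df = pd.DataFrame(dd)

# Column names as a numpy array so they can be indexed with a boolean mask.
cols = df.columns.to_numpy()

# Assumption: the frame holds only 0/1 values, so casting to bool is lossless.
mask = df.to_numpy().astype(bool)

# For each row, keep the column names where the row holds a 1.
itemset = [cols[row].tolist() for row in mask]

print(itemset)
```

On the sample frame this yields the same list of lists as the mul-based approach.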