Convert a pandas DataFrame to a dictionary
Question
I have a pandas DataFrame as below:
import pandas as pd
df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0, 0, 1], 'c': [0, 1, 0], 'd': [1, 0, 0]})
df
which looks like:
        a  b  c  d
0     red  0  0  1
1  yellow  0  1  0
2    blue  1  0  0
I want to convert it to a dictionary so that I get:
red d
yellow c
blue b
The dataset is quite large, so please avoid any iterative method. I haven't figured out a solution yet. Any help is appreciated.
Recommended answer
First of all, if you really want to convert this to a dictionary, it's a little nicer to turn the column you want as keys into the index of the DataFrame:
df.set_index('a', inplace=True)
This looks like:
        b  c  d
a
red     0  0  1
yellow  0  1  0
blue    1  0  0
Your data appears to be in "one-hot" encoding. You first have to reverse that by taking, for each row, the label of the column that holds the maximum value:
series = df.idxmax(axis=1)
This looks like:
a
red d
yellow c
blue b
dtype: object
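As a rough sketch of what idxmax(axis=1) is doing here: for each row it returns the label of the first column that holds the row maximum, so it only reverses the encoding cleanly when every row has exactly one 1. The check on row sums and the all-zero "green" row below are my own illustrative additions, not part of the original question.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": ["red", "yellow", "blue"], "b": [0, 0, 1], "c": [0, 1, 0], "d": [1, 0, 0]}
).set_index("a")

# In a valid one-hot block every row sums to exactly 1 (assumed sanity check).
is_one_hot = bool((df.sum(axis=1) == 1).all())

# Per row, the label of the first column holding the row maximum.
labels = df.idxmax(axis=1)

# Hypothetical all-zero row: every column ties for the maximum,
# so idxmax silently falls back to the first column.
broken = pd.DataFrame({"b": [0], "c": [0], "d": [0]}, index=["green"])
fallback = broken.idxmax(axis=1)
```

If the real data might contain all-zero rows or rows with several 1s, it is probably worth running a check like is_one_hot before trusting the result.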
Almost there! Now use to_dict on the resulting Series (this is where setting column a as the index helps out, since the index labels become the dictionary keys):
series.to_dict()
This looks like:
{'blue': 'b', 'red': 'd', 'yellow': 'c'}
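To make the role of the index concrete, here is a small sketch comparing to_dict on a Series indexed by the colour names with the same Series after the index is dropped. The Series is built by hand to mirror the output above; the variable names are only illustrative.

```python
import pandas as pd

# Hand-built copy of the Series produced by idxmax above.
series = pd.Series(
    ["d", "c", "b"],
    index=pd.Index(["red", "yellow", "blue"], name="a"),
)

# Keys come from the index labels.
with_index = series.to_dict()

# With the default integer index, the keys are just row positions.
without_index = series.reset_index(drop=True).to_dict()
```

So without set_index('a') the dictionary would be keyed by 0, 1, 2 rather than by colour.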
Which I think is what you are looking for. As a one-liner:
df.set_index('a').idxmax(axis=1).to_dict()
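For completeness, a self-contained version of the whole thing, starting from the original frame with a still a regular column. The helper name one_hot_to_dict is hypothetical and just wraps the one-liner; because set_index is called without inplace=True, the original frame is left untouched.

```python
import pandas as pd


def one_hot_to_dict(frame: pd.DataFrame, key_col: str) -> dict:
    """Hypothetical helper: map each key to the label of its hot column."""
    # set_index returns a new frame, so the caller's frame is not modified.
    return frame.set_index(key_col).idxmax(axis=1).to_dict()


df = pd.DataFrame(
    {"a": ["red", "yellow", "blue"], "b": [0, 0, 1], "c": [0, 1, 0], "d": [1, 0, 0]}
)

result = one_hot_to_dict(df, "a")
print(result)
```

Everything here is vectorized pandas, so no explicit Python loop over the rows is needed.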