Extract dictionary value from column in data frame
Question
I'm looking for a way to optimize my code.
My input data has this form:
import pandas as pn
a = [{'Feature1': 'aa1', 'Feature2': 'bb1', 'Feature3': 'cc2'},
     {'Feature1': 'aa2', 'Feature2': 'bb2'},
     {'Feature1': 'aa1', 'Feature2': 'cc1'}]
b = ['num1', 'num2', 'num3']
df = pn.DataFrame({'num': b, 'dic': a})
I would like to extract the element 'Feature3' (if it exists) from the dictionaries in column 'dic' of the data frame above. So far I have been able to solve it, but I don't know if this is the fastest way, and it seems a bit overcomplicated.
Feature3 = []
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for idx, row in df['dic'].items():
    l = row.keys()
    if 'Feature3' in l:
        Feature3.append(row['Feature3'])
    else:
        Feature3.append(None)
df['Feature3'] = Feature3
print(df)
Is there a better/faster/simpler way to extract Feature3 into a separate column of the dataframe?
Thanks in advance for your help.
Recommended answer
You can use a list comprehension to extract 'Feature3' from each row in your dataframe, returning a list.
feature3 = [d.get('Feature3') for d in df.dic]
If 'Feature3' is not in a given dictionary in dic, get() returns None by default.
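The None fallback comes from dict.get itself, which also accepts an optional second argument as the default. Here is a small sketch with no pandas involved; the `row` dictionary and the 'missing' placeholder are made up for illustration and are not part of the original data:

```python
# Hypothetical row without a 'Feature3' key (illustrative only).
row = {'Feature1': 'aa2', 'Feature2': 'bb2'}

missing_default = row.get('Feature3')            # no fallback given, so None
missing_custom = row.get('Feature3', 'missing')  # explicit fallback value
present = row.get('Feature1')                    # key exists, its value is returned

print(missing_default, missing_custom, present)
```

Unlike row['Feature3'], which would raise a KeyError here, get() never fails on a missing key, so the if/else branch from the question is not needed.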
You don't even need pandas, as you can again use a list comprehension to extract the feature from your original list of dictionaries a.
feature3 = [d.get('Feature3') for d in a]
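Since the question asked for a separate column, the list can be assigned straight back to the dataframe. This is a minimal sketch that rebuilds the question's data; it uses the conventional `pd` alias rather than the question's `pn`, and assumes a reasonably recent pandas:

```python
import pandas as pd

# Data from the question.
a = [{'Feature1': 'aa1', 'Feature2': 'bb1', 'Feature3': 'cc2'},
     {'Feature1': 'aa2', 'Feature2': 'bb2'},
     {'Feature1': 'aa1', 'Feature2': 'cc1'}]
b = ['num1', 'num2', 'num3']
df = pd.DataFrame({'num': b, 'dic': a})

# Same list comprehension as in the answer, assigned directly to a new column.
# Rows whose dictionary has no 'Feature3' key end up with a missing value.
df['Feature3'] = [d.get('Feature3') for d in df['dic']]

print(df)
```

This replaces the explicit loop and the temporary list from the question with a single assignment.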