Extracting dictionary values from a pandas dataframe

Problem description

I need to extract a feature from a dataset that I imported from a .json file.

This is what it looks like:

import pandas as pd

f1 = pd.read_json('https://raw.githubusercontent.com/ansymo/msr2013-bug_dataset/master/data/v02/eclipse/short_desc.json')

print(f1.head())


                                               short_desc
1       [{'when': 1002742486, 'what': 'Usability issue...
10      [{'when': 1002742495, 'what': 'API - VCM event...
100     [{'when': 1002742586, 'what': 'Would like a wa...
10000   [{'when': 1014113227, 'what': 'getter/setter c...
100001  [{'when': 1118743999, 'what': 'Create Help Ind...

In essence, I need a column named after 'short_desc' that holds only the string stored under the 'what' key in each row, e.g. 'Usability issue...

So far, I've tried the following:

f1['desc'] = pd.DataFrame([x for x in f1['short_desc']])

This fails with the following error:

Wrong number of items passed 19, placement implies 1

Is there an easy way to accomplish this without the use of loops? Could someone point this newbie in the right direction?

Recommended answer

Don't initialise a dataframe and then try to assign it to a single column; a column is meant to be a pd.Series (or another list-like of matching length).
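To see why the original attempt fails, here is a minimal sketch on made-up data. The `sample` frame and its values are hypothetical stand-ins shaped like the 'short_desc' column, not rows from the real dataset. Building a dataframe from the lists produces one column per list position, which is most likely where the "19 items" in the error message comes from.

```python
import pandas as pd

# Hypothetical sample shaped like the 'short_desc' column: each cell is a
# list of dicts, and the lists differ in length.
sample = pd.DataFrame(
    {
        "short_desc": [
            [{"when": 1, "what": "first title"}],
            [{"when": 2, "what": "old title"}, {"when": 3, "what": "new title"}],
        ]
    },
    index=[1, 10],
)

# Each inner list becomes a row, so the result has one column per list
# position; shorter lists are padded with missing values.
wide = pd.DataFrame([x for x in sample["short_desc"]])

print(wide.shape)  # (2, 2)
```

A frame with several columns cannot be placed into the single column 'desc', so the assignment is rejected.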

You should just assign the list comprehension directly, like this:

f1['desc'] = [x[0]['what'] for x in f1['short_desc']]
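The line above assumes every list holds at least one dict. The source does not say whether the real dataset contains empty lists, so the following is only a guarded sketch on hypothetical data. The `sample` frame and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample; the last row holds an empty list on purpose.
sample = pd.DataFrame(
    {
        "short_desc": [
            [{"when": 1, "what": "first title"}],
            [{"when": 2, "what": "old title"}, {"when": 3, "what": "new title"}],
            [],
        ]
    },
    index=[1, 10, 100],
)

# Take the 'what' value of the first dict in each list; fall back to None
# when the list is empty, so x[0] never raises IndexError.
sample["desc"] = [x[0]["what"] if x else None for x in sample["short_desc"]]

print(sample["desc"])
```

The list has the same length as the frame, so pandas assigns it to the new column positionally.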


As an alternative, I would propose a solution not involving any lambda functions, using operator and pd.Series.apply:

import operator

f1['desc'] = f1.short_desc.apply(operator.itemgetter(0))\
                             .apply(operator.itemgetter('what'))
print(f1.desc.head())

1           Usability issue with external editors (1GE6IRL)
10                   API - VCM event notification (1G8G6RR)
100       Would like a way to take a write lock on a tea...
10000     getter/setter code generation drops "F" in ".....
100001    Create Help Index Fails with seemingly incorre...
Name: desc, dtype: object
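For readers unfamiliar with operator.itemgetter, this small sketch shows what the two chained apply calls do to a single cell. The `cell` value and the names `first` and `what` are hypothetical, chosen only for illustration.

```python
import operator

# itemgetter(key) returns a callable that looks up `key` on its argument.
first = operator.itemgetter(0)       # behaves like: lambda x: x[0]
what = operator.itemgetter("what")   # behaves like: lambda d: d["what"]

# Hypothetical cell value shaped like one entry of the 'short_desc' column.
cell = [{"when": 1, "what": "first title"}]

print(what(first(cell)))  # first title
```

The first apply therefore turns each list into its first dict, and the second turns each dict into the string stored under 'what'.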
