pandas | Read json file with list/array-like fields to Boolean columns


Question

Here is a JSON string that contains a list of objects with each having another list embedded.

[
  {
    "name": "Alice",
    "hobbies": [
      "volleyball",
      "shopping",
      "movies"
    ]
  },
  {
    "name": "Bob",
    "hobbies": [
      "fishing",
      "movies"
    ]
  }
]

Using pandas.read_json() this turns into a DataFrame like this:

  name      hobbies
  --------------------------------------
1 Alice     [volleyball, shopping, movies]
2 Bob       [fishing, movies]

However, I would like to flatten the lists into Boolean columns like this:

  name      volleyball  shopping    movies  fishing 
  ----------------------------------------------------
1 Alice     True        True        True    False
2 Bob       False       False       True    True

I.e. when the list contains a value, the field in the corresponding column is filled with a Boolean True, otherwise with False.

I have also looked into pandas.io.json.json_normalize(), but that does not seem to support this idea either. Is there any built-in way (either Python3, or pandas) to do this?

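(P.S. I realize you could write your own code to "normalize" the dictionary objects before loading the whole list into a DataFrame, but I might be reinventing the wheel, and probably in a very inefficient way.)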

Recommended answer

First flatten the nested lists into one row per (name, hobby) pair with json_normalize, then use crosstab and cast the resulting counts to bool with astype:

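import pandas as pd

# data is the parsed JSON from the question (a list of dicts), e.g. from json.load()
# pd.json_normalize replaces the deprecated pd.io.json.json_normalize (pandas >= 1.0)
df = pd.json_normalize(data, record_path='hobbies', meta=['name']).rename(columns={0: 'hobby'})
print(df)
        hobby   name
0  volleyball  Alice
1    shopping  Alice
2      movies  Alice
3     fishing    Bob
4      movies    Bob

# crosstab counts each (name, hobby) pair; astype(bool) turns the counts into True/False
print(pd.crosstab(df['name'], df['hobby']).astype(bool))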

hobby fishing movies shopping volleyball
name                                    
Alice   False   True     True       True
Bob      True   True    False      False
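
For reference, here is a minimal end-to-end sketch of the same idea starting from the raw JSON string. It is not part of the original answer: it swaps json_normalize for DataFrame.explode (available in pandas 0.25 and later) to get one row per (name, hobby) pair, and the helper name hobbies_to_bool_columns and the variable raw are hypothetical names chosen for illustration.

```python
import json

import pandas as pd

# The JSON from the question, inlined as a string (assumption: in practice
# this would come from a file or an API response).
raw = """
[
  {"name": "Alice", "hobbies": ["volleyball", "shopping", "movies"]},
  {"name": "Bob", "hobbies": ["fishing", "movies"]}
]
"""


def hobbies_to_bool_columns(json_text):
    """Hypothetical helper: one Boolean column per distinct hobby."""
    records = json.loads(json_text)
    df = pd.DataFrame(records)
    # explode() repeats each row once per element of its 'hobbies' list.
    long = df.explode("hobbies")
    # crosstab counts (name, hobby) pairs; astype(bool) maps counts to flags.
    flags = pd.crosstab(long["name"], long["hobbies"]).astype(bool)
    # Drop the 'hobbies' label on the column axis and make 'name' a column again.
    return flags.rename_axis(columns=None).reset_index()


result = hobbies_to_bool_columns(raw)
print(result)
```

As with the crosstab output above, the hobby columns come out in alphabetical order rather than in order of first appearance; reindex the columns afterwards if a specific order matters.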
