How to Read a Text File of Dictionaries into a DataFrame


Question

I have a text file of Clash Royale stats from Kaggle. It is formatted as Python dictionaries, one per line. I am struggling to find a way to read it into a DataFrame in a meaningful way, and I am curious what the best approach is. Each entry is a fairly complex dict with nested lists.

The original dataset is here: https://www.kaggle.com/s1m0n38/clash-royale-matches-dataset

{'players': {'right': {'deck': [['Mega Minion', '9'], ['Electro Wizard', '3'], ['Arrows', '11'], ['Lightning', '5'], ['Tombstone', '9'], ['The Log', '2'], ['Giant', '9'], ['Bowler', '5']], 'trophy': '4258', 'clan': 'TwoFiveOne', 'name': 'gpa raid'}, 'left': {'deck': [['Fireball', '9'], ['Archers', '12'], ['Goblins', '12'], ['Minions', '11'], ['Bomber', '12'], ['The Log', '2'], ['Barbarians', '12'], ['Royal Giant', '13']], 'trophy': '4325', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['2', '0'], 'time': '2017-07-12'}
{'players': {'right': {'deck': [['Ice Spirit', '10'], ['Valkyrie', '9'], ['Hog Rider', '9'], ['Inferno Tower', '9'], ['Goblins', '12'], ['Musketeer', '9'], ['Zap', '12'], ['Fireball', '9']], 'trophy': '4237', 'clan': 'The Wolves', 'name': 'TITAN'}, 'left': {'deck': [['Royal Giant', '13'], ['Ice Wizard', '2'], ['Bomber', '12'], ['Knight', '12'], ['Fireball', '9'], ['Barbarians', '12'], ['The Log', '2'], ['Archers', '12']], 'trophy': '4296', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['1', '0'], 'time': '2017-07-12'}
{'players': {'right': {'deck': [['Miner', '3'], ['Ice Golem', '9'], ['Spear Goblins', '12'], ['Minion Horde', '12'], ['Inferno Tower', '8'], ['The Log', '2'], ['Skeleton Army', '6'], ['Fireball', '10']], 'trophy': '4300', 'clan': '@LA PERLA NEGRA', 'name': 'Victor'}, 'left': {'deck': [['Royal Giant', '13'], ['Ice Wizard', '2'], ['Bomber', '12'], ['Knight', '12'], ['Fireball', '9'], ['Barbarians', '12'], ['The Log', '2'], ['Archers', '12']], 'trophy': '4267', 'clan': 'battusai', 'name': 'Supr4'}}, 'type': 'ladder', 'result': ['0', '1'], 'time': '2017-07-12'}

Recommended answer

I saved your data to a .json file, then looped through each line and treated it as its own JSON document. I then used pandas.json_normalize to load each one into a DataFrame. I made some guesses at how you wanted the df to look, and came up with the code below.

Note: proper JSON requires double quotes, not single quotes, so I used replace to work around this. Be careful that this does not corrupt any data inside the strings, such as a name containing an apostrophe.
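If the quote replacement is a concern, one alternative (not part of the original answer) is to parse each line as a Python literal with ast.literal_eval from the standard library, which needs no quote rewriting. The sketch below uses two made-up sample lines in the same shape as the data; the clan name with an apostrophe is a hypothetical value chosen to show the case where replace would produce invalid JSON.

```python
import ast

# Two sample lines shaped like the dataset. The clan name in the second
# line is a made-up value containing an apostrophe, which would break
# the replace("'", '"') approach.
sample_lines = [
    "{'type': 'ladder', 'result': ['2', '0'], 'time': '2017-07-12'}",
    "{'type': 'ladder', 'result': ['1', '0'], 'clan': \"Dragon's Den\"}",
]


def parse_line(line):
    """Parse one Python-literal dict from a line of text."""
    return ast.literal_eval(line.strip())


# Skip blank lines, parse everything else.
records = [parse_line(line) for line in sample_lines if line.strip()]
print(records[1]['clan'])
```

The resulting dicts can be passed to pandas.json_normalize in the same way as the output of json.loads.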

Note: to get this to work, I had to merge the 'right' and 'left' entries, so you lose the information about which side each player was on. If you need it, you could use a dict comprehension as a workaround.

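import json
import pandas as pd

frames = []
with open('cr.json', 'r') as f:
    for line in f:
        # Each line is a Python dict literal; swap the quotes so json can parse it
        data = json.loads(line.replace("'", '"'))
        # Flatten both players' decks into one frame (one row per card).
        # The 'right'/'left' keys are merged here, so the side is not kept.
        df1 = pd.json_normalize([data['players']['right'], data['players']['left']],
                     'deck',
                     ['name', 'trophy', 'clan'],
                     meta_prefix='player.',
                     errors='ignore')
        frames.append(df1)

# Concatenate once at the end instead of growing the frame inside the loop
df = pd.concat(frames)
# Each deck entry is a [name, level] pair, so its columns arrive as 0 and 1
df.rename(columns={0: 'player.troop.name', 1: 'player.troop.level'},
          inplace=True)
print(df)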

Prints:

   player.troop.name player.troop.level player.name      player.clan  \
0        Mega Minion                  9    gpa raid       TwoFiveOne   
1     Electro Wizard                  3    gpa raid       TwoFiveOne   
2             Arrows                 11    gpa raid       TwoFiveOne   
3          Lightning                  5    gpa raid       TwoFiveOne   
4          Tombstone                  9    gpa raid       TwoFiveOne   
5            The Log                  2    gpa raid       TwoFiveOne   
6              Giant                  9    gpa raid       TwoFiveOne   
7             Bowler                  5    gpa raid       TwoFiveOne   
8           Fireball                  9       Supr4         battusai   
9            Archers                 12       Supr4         battusai   
10           Goblins                 12       Supr4         battusai   
11           Minions                 11       Supr4         battusai   
12            Bomber                 12       Supr4         battusai   
13           The Log                  2       Supr4         battusai   
14        Barbarians                 12       Supr4         battusai   
15       Royal Giant                 13       Supr4         battusai   
0         Ice Spirit                 10       TITAN       The Wolves   
1           Valkyrie                  9       TITAN       The Wolves   
2          Hog Rider                  9       TITAN       The Wolves   
3      Inferno Tower                  9       TITAN       The Wolves   
4            Goblins                 12       TITAN       The Wolves   
5          Musketeer                  9       TITAN       The Wolves   
6                Zap                 12       TITAN       The Wolves   
7           Fireball                  9       TITAN       The Wolves   
8        Royal Giant                 13       Supr4         battusai   
9         Ice Wizard                  2       Supr4         battusai   
10            Bomber                 12       Supr4         battusai   
11            Knight                 12       Supr4         battusai   
12          Fireball                  9       Supr4         battusai   
13        Barbarians                 12       Supr4         battusai   
14           The Log                  2       Supr4         battusai   
15           Archers                 12       Supr4         battusai   
0              Miner                  3      Victor  @LA PERLA NEGRA   
1          Ice Golem                  9      Victor  @LA PERLA NEGRA   
2      Spear Goblins                 12      Victor  @LA PERLA NEGRA   
3       Minion Horde                 12      Victor  @LA PERLA NEGRA   
4      Inferno Tower                  8      Victor  @LA PERLA NEGRA   
5            The Log                  2      Victor  @LA PERLA NEGRA   
6      Skeleton Army                  6      Victor  @LA PERLA NEGRA   
7           Fireball                 10      Victor  @LA PERLA NEGRA   
8        Royal Giant                 13       Supr4         battusai   
9         Ice Wizard                  2       Supr4         battusai   
10            Bomber                 12       Supr4         battusai   
11            Knight                 12       Supr4         battusai   
12          Fireball                  9       Supr4         battusai   
13        Barbarians                 12       Supr4         battusai   
14           The Log                  2       Supr4         battusai   
15           Archers                 12       Supr4         battusai   

   player.trophy  
0           4258  
1           4258  
2           4258  
3           4258  
4           4258  
5           4258  
6           4258  
7           4258  
8           4325  
9           4325  
10          4325  
11          4325  
12          4325  
13          4325  
14          4325  
15          4325  
0           4237  
1           4237  
2           4237  
3           4237  
4           4237  
5           4237  
6           4237  
7           4237  
8           4296  
9           4296  
10          4296  
11          4296  
12          4296  
13          4296  
14          4296  
15          4296  
0           4300  
1           4300  
2           4300  
3           4300  
4           4300  
5           4300  
6           4300  
7           4300  
8           4267  
9           4267  
10          4267  
11          4267  
12          4267  
13          4267  
14          4267  
15          4267

And df.iloc[0] looks like this:

player.troop.name Mega Minion
player.troop.level         9
player.name         gpa raid
player.trophy           4258
player.clan       TwoFiveOne
Name: 0, dtype: object
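The dtype: object above reflects that every value in the source file is a string, including the troop levels and trophy counts. If numeric columns are wanted, a conversion along these lines might help. This is a small sketch on a hand-built two-row frame (df_small and numeric_cols are illustrative names, not part of the original answer), using pandas.to_numeric.

```python
import pandas as pd

# A two-row stand-in for the frame produced above; all values are strings,
# exactly as they appear in the source file.
df_small = pd.DataFrame({
    'player.troop.name': ['Mega Minion', 'Electro Wizard'],
    'player.troop.level': ['9', '3'],
    'player.trophy': ['4258', '4258'],
})

# Convert the columns that hold digits into numeric dtypes.
numeric_cols = ['player.troop.level', 'player.trophy']
df_small[numeric_cols] = df_small[numeric_cols].apply(pd.to_numeric)
print(df_small.dtypes)
```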

You can rework the json_normalize parameters however you see fit, but I hope this is more than enough to get you going.
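For the 'right'/'left' information that the code above drops, one possible workaround is to build the rows with a comprehension over data['players'].items() and tag each row with its side. The sketch below is an assumption about what such a workaround could look like, not the answerer's own code; the helper match_to_rows and the 'player.side' column name are made up for illustration, and the decks are shortened to two cards each.

```python
import pandas as pd

# One match in the same shape as the question's data (decks shortened).
match = {
    'players': {
        'right': {'deck': [['Mega Minion', '9'], ['Arrows', '11']],
                  'trophy': '4258', 'clan': 'TwoFiveOne', 'name': 'gpa raid'},
        'left': {'deck': [['Fireball', '9'], ['Archers', '12']],
                 'trophy': '4325', 'clan': 'battusai', 'name': 'Supr4'},
    },
    'type': 'ladder', 'result': ['2', '0'], 'time': '2017-07-12',
}


def match_to_rows(match):
    """Return one row per card, tagged with the side the player was on."""
    return [
        {
            'player.side': side,  # hypothetical column name
            'player.name': player['name'],
            'player.trophy': player['trophy'],
            'player.clan': player['clan'],
            'player.troop.name': troop,
            'player.troop.level': level,
        }
        for side, player in match['players'].items()
        for troop, level in player['deck']
    ]


df_sides = pd.DataFrame(match_to_rows(match))
print(df_sides)
```

Applied to the whole file, the rows from every parsed line could be collected into one list before a single pd.DataFrame call.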
