json_normalize produces confusing KeyError
I'm trying to convert nested JSON to a pandas DataFrame. I had been using json_normalize successfully until I came across one particular JSON structure. I've made a smaller version of it to reproduce the problem.
from pandas.io.json import json_normalize
json = [{"events": [{"schedule": {"date": "2015-08-27",
                                  "location": {"building": "BDC", "floor": 5},
                                  "ID": 815},
                     "group": "A"},
                    {"schedule": {"date": "2015-08-27",
                                  "location": {"building": "BDC", "floor": 5},
                                  "ID": 816},
                     "group": "A"}]}]
I ran:
json_normalize(json[0],'events',[['schedule','date'],['schedule','location','building'],['schedule','location','floor']])
Expecting to see something like this:
ID     group  schedule.date  schedule.location.building  schedule.location.floor
'815'  'A'    '2015-08-27'   'BDC'                       5
'816'  'A'    '2015-08-27'   'BDC'                       5
But instead I got this error:
In [2]: json_normalize(json[0],'events',[['schedule','date'],['schedule','location','building'],['schedule','location','floor']])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-2-b588a9e3ef1d> in <module>()
----> 1 json_normalize(json[0],'events',[['schedule','date'],['schedule','location','building'],['schedule','location','floor']])
/Users/logan/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/json.pyc in json_normalize(data, record_path, meta, meta_prefix, record_prefix)
739 records.extend(recs)
740
--> 741 _recursive_extract(data, record_path, {}, level=0)
742
743 result = DataFrame(records)
/Users/logan/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/json.pyc in _recursive_extract(data, path, seen_meta, level)
734 meta_val = seen_meta[key]
735 else:
--> 736 meta_val = _pull_field(obj, val[level:])
737 meta_vals[key].append(meta_val)
738
/Users/logan/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/json.pyc in _pull_field(js, spec)
674 if isinstance(spec, list):
675 for field in spec:
--> 676 result = result[field]
677 else:
678 result = result[spec]
KeyError: 'schedule'
In this case, I think you'd just pass the list of events directly (data here is the list the question calls json):
In [57]: json_normalize(data[0]['events'])
Out[57]:
  group  schedule.ID schedule.date schedule.location.building  \
0     A          815    2015-08-27                        BDC
1     A          816    2015-08-27                        BDC

   schedule.location.floor
0                        5
1                        5
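If you want the column layout from the question (ID first, then group, then the schedule fields), a rename and a column selection on top of the flattened frame should get you there. This is only a sketch: it assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize (the question uses the older pandas.io.json import), and the names flat and wanted are just illustrative.

```python
import pandas as pd

# Same sample data as in the question.
data = [{"events": [{"schedule": {"date": "2015-08-27",
                                  "location": {"building": "BDC", "floor": 5},
                                  "ID": 815},
                     "group": "A"},
                    {"schedule": {"date": "2015-08-27",
                                  "location": {"building": "BDC", "floor": 5},
                                  "ID": 816},
                     "group": "A"}]}]

# Flatten each event; nested dict keys become dotted column names.
flat = pd.json_normalize(data[0]["events"])

# Rename schedule.ID to ID and put the columns in the order the question expected.
wanted = flat.rename(columns={"schedule.ID": "ID"})[
    ["ID", "group", "schedule.date",
     "schedule.location.building", "schedule.location.floor"]
]
print(wanted)
```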
The meta paths ([['schedule','date']...]) are for specifying data at the same level of nesting as your list of records, i.e. at the same level as 'events', not inside each record. That is where the KeyError comes from: json_normalize looks up 'schedule' as a key of json[0], which only has 'events'. It doesn't look like json_normalize handles dicts with nested lists particularly well, so you may need to do some manual reshaping if your actual data is much more complicated.
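To make the meta behaviour concrete, here is a small sketch of a case where meta does apply. The "source" key is hypothetical (it is not in the question's data); it is added as a sibling of 'events' purely to show what a valid meta path looks like. This again assumes pandas 1.0 or later with pd.json_normalize.

```python
import pandas as pd

# Question's data plus a hypothetical top-level "source" field next to "events".
data = [{
    "source": "calendar-export",  # hypothetical sibling of "events"
    "events": [
        {"schedule": {"date": "2015-08-27",
                      "location": {"building": "BDC", "floor": 5},
                      "ID": 815},
         "group": "A"},
        {"schedule": {"date": "2015-08-27",
                      "location": {"building": "BDC", "floor": 5},
                      "ID": 816},
         "group": "A"},
    ],
}]

# record_path selects the list of records; meta pulls fields that live
# beside that list and repeats them on every row.
df = pd.json_normalize(data, record_path="events", meta=["source"])
print(sorted(df.columns))

# A meta path that only exists inside the records fails, as in the question.
try:
    pd.json_normalize(data, record_path="events", meta=[["schedule", "date"]])
    meta_lookup_failed = False
except KeyError:
    meta_lookup_failed = True
print(meta_lookup_failed)
```

The fields inside each event are flattened into dotted column names without needing to be listed in meta, so meta is only needed for values stored outside the records themselves.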