Convert JSON nested list of dict to DataFrame

Question

How can I convert the following list of dicts (JSON output) to a pandas DataFrame? I tried:

res = {} 
for d in list_of_dict: 
    res.update(d)

It gives me this error:

ValueError: dictionary update sequence element #0 has length 33; 2 is required
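For reference, this error can be reproduced without the API. A likely cause, assuming the loop variable is itself a list of plan dicts rather than a single dict, is that dict.update() treats a list argument as a sequence of (key, value) pairs, and each plan dict in the example has 33 keys instead of 2. The sketch below uses a made-up three-key record (the names plans and outer are illustrative), so the reported length is 3 rather than 33.

```python
# Hypothetical miniature of the API payload: one list holding plan dicts.
plans = [{"PlanId": 65860, "Lives": 3814196, "IsGeneric": False}]

# Assumption: the iterated container has a single element that is the whole
# list, as with a one-row pandas Series whose cell holds the list of plans.
outer = [plans]

res = {}
error_message = None
for d in outer:
    try:
        # d is a list of dicts, so update() expects each element to be a
        # (key, value) pair and rejects the 3-key dict it finds instead.
        res.update(d)
    except ValueError as exc:
        error_message = str(exc)

print(error_message)
```

If this is what happens, merging the dicts with update() would not give one row per plan anyway, since later plans would overwrite earlier ones key by key.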

Example JSON output that needs to be converted to a DataFrame:

{
    "PlanCoverages": [
        {
            "PlanId": 65860,
            "FormularyId": 61855,
            "PlanName": "CVS Caremark Performance Standard Control w/Advanced Specialty Control",
            "PlanTypeId": 15,
            "ChannelId": 1,
            "ProductId": 237171,
            "MonthId": 202002,
            "ControllerId": 884,
            "NoteId": null,
            "Lives": 3814196,
            "DrugListTierId": 2,
            "DrugListTierName": "Not reimbursed",
            "FormularyTierId": 26,
            "FormularyUnifiedTierId": 13,
            "UnifiedTierId": 11,
            "UnifiedTierName": "Not Covered",
            "UnifiedTierShortName": "Not Covered",
            "UnifiedTierSort": 11,
            "PromotionalTierId": null,
            "IsGeneric": false,
            "PriorAuthorization": false,
            "OtherNote": false,
            "StepTherapy": false,
            "QuantityLimit": false,
            "Variance": false,
            "Restrictions": "",
            "CoveredAlternatives": 88,
            "RecommendedAlternatives": 0,
            "SpecialtyPharmacy": false,
            "ConditionalPriorAuthorization": false,
            "DurableMedicalEquipment": false,
            "MedicalBenefit": false,
            "OverTheCounter": false
        },
        {
            "PlanId": 69549,
            "FormularyId": 63811,
            "PlanName": "CVS Caremark Performance Standard Opt-Out w/ Advanced Specialty Control ",
            "PlanTypeId": 15,
            "ChannelId": 1,
            "ProductId": 237171,
            "MonthId": 202002,
            "ControllerId": 884,
            "NoteId": null,
            "Lives": 1460242,
            "DrugListTierId": 2,
            "DrugListTierName": "Not reimbursed",
            "FormularyTierId": 26,
            "FormularyUnifiedTierId": 13,
            "UnifiedTierId": 11,
            "UnifiedTierName": "Not Covered",
            "UnifiedTierShortName": "Not Covered",
            "UnifiedTierSort": 11,
            "PromotionalTierId": null,
            "IsGeneric": false,
            "PriorAuthorization": false,
            "OtherNote": false,
            "StepTherapy": false,
            "QuantityLimit": false,
            "Variance": false,
            "Restrictions": "",
            "CoveredAlternatives": 121,
            "RecommendedAlternatives": 0,
            "SpecialtyPharmacy": false,
            "ConditionalPriorAuthorization": false,
            "DurableMedicalEquipment": false,
            "MedicalBenefit": false,
            "OverTheCounter": false
        } ]
}

Here's my full code. It connects to an API and scrapes information on pharmaceuticals. I need the PlanCoverages of 1330 plans.

import requests
import pandas as pd
from pandas.io.json import json_normalize 
import json

headers = {
    'Accept': '*/*',
    'X-Requested-With': 'XMLHttpRequest',
    'Access-Token': 'H-oa4ULGls2Cpu8U6hX4myixRoFIPxfj',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36',
    'Is-Session-Expired': 'false',
    'Referer': 'https://formularylookup.com/',
}

response = requests.get('https://formularylookup.com/Formulary/Coverage/Controller?ProductId=237171&ProductName=Rybelsus&ControllerId=884&ChannelId=1&StateId=all&DrugTypeId=3&Options=PlanCoverages', headers=headers)
df = response.json()

df_normal =  json_normalize(df)["PlanCoverages"]#["ControllerCoverages"]
#dff = pd.DataFrame(df_normal)

#dff = json.dumps(df, indent=4, sort_keys=False)

res = {} 
for d in df_normal: 
    res.update(d)

print(res)

The ideal output is one row per plan, so 1330 rows in total.

Recommended answer

Something like this should work. I have assumed your JSON object is one large string named 'json_string'.

import pandas as pd    
import json

# json object:
json_string = """ { "PlanCoverages": [ { "PlanId": 65860, ... """

# 1) load json object as python variable:
data = json.loads(json_string)

# 2) convert to dataframe:
plan_coverages = pd.DataFrame(data['PlanCoverages'])
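
When the JSON has already been parsed into a Python dict, the same idea applies without the json.loads step. The sketch below is a minimal illustration under that assumption: payload is a hypothetical two-plan dict with the fields trimmed, standing in for the parsed response. It also shows pd.json_normalize with record_path as an alternative that should give the same rows here.

```python
import pandas as pd

# Hypothetical two-plan payload shaped like the example JSON (fields trimmed).
payload = {
    "PlanCoverages": [
        {"PlanId": 65860, "FormularyId": 61855, "NoteId": None, "Lives": 3814196},
        {"PlanId": 69549, "FormularyId": 63811, "NoteId": None, "Lives": 1460242},
    ]
}

# A list of flat dicts maps directly to rows: one row per plan.
plan_coverages = pd.DataFrame(payload["PlanCoverages"])

# Alternative: json_normalize reads the records under the given key and
# would also flatten nested dicts inside each record, if there were any.
plan_coverages_alt = pd.json_normalize(payload, record_path="PlanCoverages")

print(plan_coverages.shape)
```

In the question's code, the value returned by response.json() would play the role of payload, so the DataFrame could be built from its "PlanCoverages" entry directly.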
