Complex nested dictionaries


Problem description



The problem I'm encountering is that I can't work out why my code doesn't produce the output I want. It might have to do with my understanding of dictionaries or with the logic in my code. Can someone help me build these nested dictionaries? Link to CSV: https://docs.google.com/document/d/1v68_QQX7Tn96l-b0LMO9YZ4ZAn_KWDMUJboa6LEyPr8/edit?usp=sharing

import csv
data_by_region = {}
data_by_country = {}
answers = []
data = []

countries = False
f = open('dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv')

reader = csv.DictReader(f)

for line in reader:
  #This gets all the values into a standard dict
  data.append(dict(line))  

#This will loop thru the dict and create variables to hold specific items
for i in data: 
  # collects all of the Region/Country/Area
  places = i['Region/Country/Area'] 
  # Gets All the Years
  years = i['Year']
  i_d = i['ID']
  info = i['Footnotes']
  series = i['Series']
  value = float(i['Value'])
  # print(series)
  stats = {i['Series']:i['Value']}
  # print(stats)


  if (i['ID']== '4'):
    countries = True
  if countries == True:
    if places not in data_by_country:
      data_by_country[places] = {}
    if years not in data_by_country:
      data_by_country[places][years] = {}
      data_by_country[places][years].update(stats)
    # if series not in data_by_country:
    #   data_by_country[places][years][series] = {}
    # if value not in data_by_country:
    #   data_by_country[places][years][series] = value
  else:
    if places not in data_by_region:
      data_by_region[places] = {}
    if years not in data_by_region:
      data_by_region[places][years] = {}
      data_by_region[places][years] = stats
    # if series not in data_by_region:
    #   data_by_region[places][series] = series
    # # if value not in data_by_region:
    #   data_by_region[places][years][series] = value


print(data_by_region['Western Africa'])

The code above doesn't print the structure I expect. The output I'm going for is:

"Western Africa" : {
    2005: {
        "Population annual rate of increase (percent)": 2.6,
        "Total fertility rate (children per women)": 6,
        "Infant mortality for both sexes (per 1,000 live births)": 95.7,
        "Life expectancy at birth for both sexes (years)": 49.3,
        "Life expectancy at birth for males (years)": 48.4,
        "Life expectancy at birth for females (years)": 50.2
    },
    2010: {
        <data>
    },
    2015: {
        <data>
    }
}

Solution

I strongly recommend using the pandas package. It is designed specifically for managing this kind of tabular data and comes with many functions for analyzing and visualizing it, so it should get you to your goal with much less code.

For example, you can read your file this way:

import pandas as pd
filename = 'dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv'
df = pd.read_csv(filename)

In your case you also need to pass "," as the thousands separator:

df = pd.read_csv(filename, thousands=r',')
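To see what the thousands option changes, here is a minimal sketch on an in-memory sample. The two rows and the "Example count" series name are made up for illustration and are not taken from the real file; only the column names follow the CSV used in the question.

```python
import io
import pandas as pd

# Hypothetical sample: the values are invented to show the separator handling.
text = (
    'Region/Country/Area,Year,Series,Value\n'
    'Western Africa,2005,Example count,"1,234"\n'
    'Western Africa,2010,Example count,"2,345.5"\n'
)

# Without the option, "1,234" cannot be parsed as a number and stays a string.
raw = pd.read_csv(io.StringIO(text))
print(repr(raw["Value"].iloc[0]))

# With thousands=",", the separator is dropped and the column becomes numeric.
df_sample = pd.read_csv(io.StringIO(text), thousands=",")
print(df_sample["Value"].tolist())
```

Without a numeric Value column, the pivot table step later on would have nothing to aggregate, which is why the option matters here.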

This gives you a DataFrame, an object that holds your data organized by columns. You can filter it, reshape it, convert it to a dictionary, or work with it directly to reach your goal.

You can get all the data for an ID:

df[df['ID'] == 4]

Or for a specific region:

wa = df[df['Region/Country/Area'] == 'Western Africa']

Or you can loop through all unique values:

unique_regions = df['Region/Country/Area'].unique()

With that sub-dataframe you can build a pivot table this way:

wa1 = pd.pivot_table(wa, index='Year', columns='Series', values='Value')

Then you can convert that new dataframe into a list of dictionaries, one per row (that is, one per year):

values = wa1.to_dict('records')

And get the matching index values (the years) with:

indexes = wa1.index

Those two lists can be used to build a dictionary for each region:

d = {key: value for (key, value) in zip(indexes, values)}

{2005: {'Infant mortality for both sexes (per 1,000 live births)': 95.700000000000003,
'Life expectancy at birth for both sexes (years)': 49.299999999999997,
'Life expectancy at birth for females (years)': 50.200000000000003,
'Life expectancy at birth for males (years)': 48.399999999999999,
'Population annual rate of increase (percent)': 2.6000000000000001,
'Total fertility rate (children per women)': 6.0},

2010: {'Infant mortality for both sexes (per 1,000 live births)': 82.700000000000003,
'Life expectancy at birth for both sexes (years)': 52.299999999999997,
'Life expectancy at birth for females (years)': 53.200000000000003,
'Life expectancy at birth for males (years)': 51.5,
'Population annual rate of increase (percent)': 2.7000000000000002,
'Total fertility rate (children per women)': 5.7999999999999998},

2015: {'Infant mortality for both sexes (per 1,000 live births)': 70.5,
'Life expectancy at birth for both sexes (years)': 54.700000000000003,
'Life expectancy at birth for females (years)': 55.600000000000001,
'Life expectancy at birth for males (years)': 53.899999999999999,
'Population annual rate of increase (percent)': 2.7000000000000002,
'Total fertility rate (children per women)': 5.5}}
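The pivot, to_dict and zip steps can be tried end to end on a small in-memory frame. This is only a sketch: the four rows below are a hand-typed subset (rounded values from the Western Africa output above), and the names wa_sample and by_year are illustrative.

```python
import pandas as pd

# Hand-typed subset in the same long format as the CSV (one row per year/series).
rows = [
    {"Year": 2005, "Series": "Total fertility rate (children per women)", "Value": 6.0},
    {"Year": 2005, "Series": "Life expectancy at birth for males (years)", "Value": 48.4},
    {"Year": 2010, "Series": "Total fertility rate (children per women)", "Value": 5.8},
    {"Year": 2010, "Series": "Life expectancy at birth for males (years)", "Value": 51.5},
]
wa_sample = pd.DataFrame(rows)

# One row per year, one column per series.
pivot = pd.pivot_table(wa_sample, index="Year", columns="Series", values="Value")

# One dict per pivot row, in the same order as pivot.index.
records = pivot.to_dict("records")

# Pair each year with its record; int() turns the NumPy integers into plain ints.
by_year = {int(year): record for year, record in zip(pivot.index, records)}
print(by_year[2005])

# to_dict("index") builds the same year -> record mapping in a single call.
same = pivot.to_dict("index")
```

Note that pivot_table aggregates duplicates with the mean by default, so this relies on each (year, series) pair appearing only once per region.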

Finally, you can use another loop to build a list or a dictionary with an item for each region.

Summary

In summary, with pandas your code reduces to:

import pandas as pd
filename = 'dph_SYB60_T03_Population Growth, Fertility and Mortality Indicators.csv'
df_total = pd.read_csv(filename, thousands=r',')
regions = df_total['Region/Country/Area'].unique()
out = {}
for reg in regions:
    df_region = df_total[df_total['Region/Country/Area'] == reg]
    pivot = df_region.pivot_table(index='Year', columns='Series', values='Value')
    values_by_year = pivot.to_dict('records') 
    data_reg = {key: value for (key, value) in zip(pivot.index, values_by_year)}
    out[reg] = data_reg
out

After running this code, out holds the nested dictionaries you are looking for:

{'Afghanistan': {2005: {'Infant mortality for both sexes (per 1,000 live births)': 89.5,
                        'Life expectancy at birth for both sexes (years)': 56.899999999999999,
                        'Life expectancy at birth for females (years)': 58.100000000000001,
                        'Life expectancy at birth for males (years)': 55.799999999999997,
                        'Maternal mortality ratio (deaths per 100,000 population)': 821.0,
                        'Population annual rate of increase (percent)': 4.4000000000000004,
                        'Total fertility rate (children per women)': 7.2000000000000002},
                 2010: {'Infant mortality for both sexes (per 1,000 live births)': 76.700000000000003,
                        'Life expectancy at birth for both sexes (years)': 60.0,
                        'Life expectancy at birth for females (years)': 61.299999999999997,
                        'Life expectancy at birth for males (years)': 58.899999999999999,
                        'Maternal mortality ratio (deaths per 100,000 population)': 584.0,
                        'Population annual rate of increase (percent)': 2.7999999999999998,
                        'Total fertility rate (children per women)': 6.4000000000000004},
                 2015: {'Infant mortality for both sexes (per 1,000 live births)': 68.599999999999994,
                        'Life expectancy at birth for both sexes (years)': 62.299999999999997,
                        'Life expectancy at birth for females (years)': 63.5,
                        'Life expectancy at birth for males (years)': 61.100000000000001,
                        'Maternal mortality ratio (deaths per 100,000 population)': 396.0,
                        'Population annual rate of increase (percent)': 3.2000000000000002,
                        'Total fertility rate (children per women)': 5.2999999999999998}},
 'Africa': <DATA>,
 .
 .
 . 
 'Zimbabwe': <DATA>}
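As for why the original csv-based code loses data: the check "if years not in data_by_country" tests the outer dictionary, whose keys are place names, so it is true on every row and the year's dictionary is reset each time. Only the last series for each year survives. If you would rather stay with the csv module, the following sketch (not part of the answer above) nests the rows with setdefault. The helper name nest_rows and the two sample rows are illustrative, and the split into regions and countries by ID is left out.

```python
import csv
import io


def nest_rows(rows):
    """Group long-format rows into {place: {year: {series: value}}}."""
    nested = {}
    for row in rows:
        place = row["Region/Country/Area"]
        year = int(row["Year"])
        # Drop thousands separators before converting, e.g. "1,234" -> 1234.0
        value = float(row["Value"].replace(",", ""))
        # setdefault creates the inner dict only the first time a key is seen,
        # so later series for the same place and year are added, not overwritten.
        nested.setdefault(place, {}).setdefault(year, {})[row["Series"]] = value
    return nested


# Two hand-typed rows in the CSV's column layout (subset of the real columns).
sample = io.StringIO(
    "Region/Country/Area,Year,Series,Value\n"
    "Western Africa,2005,Population annual rate of increase (percent),2.6\n"
    "Western Africa,2005,Total fertility rate (children per women),6\n"
)
result = nest_rows(csv.DictReader(sample))
print(result["Western Africa"][2005])
```

With the real file, you would pass csv.DictReader(f) for the opened CSV instead of the in-memory sample.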
