python: creating histogram out of dictionary


Question

I am new to Python and am learning how to do things the right way.

I have a list of dictionaries, d. Each dictionary represents a user and contains information like user_id, age, etc. The list d can contain several dictionaries that represent the same user (but with slightly different information that does not matter for my purposes). I want to create a histogram that shows how many users in d have a given age. How can I do this efficiently?

Edit: I want to emphasise that I need to eliminate duplicates in the list.

Solution

Well, the classic approach to this problem would be to create a defaultdict:

import collections
histogram = collections.defaultdict(int)

Then iterate over the dictionaries in the list (using d_list instead of d as the name of the list of dictionaries) and increment the count for each age:

for d in d_list:
    histogram[d['age']] += 1
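To make the two snippets above concrete, here is a minimal self-contained sketch. The contents of d_list are made-up sample data (only the user_id and age field names come from the question), and no duplicates are removed yet.

```python
import collections

# Hypothetical sample data; only the field names follow the question.
d_list = [
    {'user_id': 1, 'age': 23},
    {'user_id': 2, 'age': 23},
    {'user_id': 3, 'age': 33},
]

# A missing key starts at int() == 0, so += 1 works the first time an age is seen.
histogram = collections.defaultdict(int)
for d in d_list:
    histogram[d['age']] += 1

# Convert to a plain dict for display.
print(dict(histogram))
```

Each age maps to the number of dictionaries that carry it, so any duplicate records would still be counted at this stage.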

But you included additional information that confuses me. You said multiple dicts could represent the same user. Do you want to eliminate those duplicates from the histogram? If that's your question, one approach would be to store the users in a dict, user_records, using (firstname, lastname) tuples as keys. Then successive dictionaries representing the same user would overwrite one another, and only one record per user (the last one seen) would be preserved. Then iterate over the values in that dictionary (using user_records.values(), or user_records.itervalues() on Python 2).

This general approach can be modified to use whatever values in each record best identify unique users. If the user_id value is unique per user, then use that as the key instead of (firstname, lastname). But your question suggested (to me) that the user_id wouldn't necessarily be the same for two records that describe the same user.
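As a rough sketch of that deduplication step, the code below keys a user_records dict on a name tuple and then counts ages over its values. The sample records and the user_key helper are hypothetical, and it assumes (as guessed above) that the same person may appear under different user_id values.

```python
import collections

# Hypothetical sample data: Joe Mann appears twice with different user_id values.
record_list = [
    {'user_id': 1, 'firstname': 'Joe', 'lastname': 'Mann', 'age': 23},
    {'user_id': 2, 'firstname': 'Alex', 'lastname': 'Moore', 'age': 23},
    {'user_id': 3, 'firstname': 'Marie', 'lastname': 'Sault', 'age': 33},
    {'user_id': 4, 'firstname': 'Joe', 'lastname': 'Mann', 'age': 23},
]

def user_key(record):
    # Hypothetical helper: swap in record['user_id'] if that is unique per user.
    return (record['firstname'], record['lastname'])

# Later records with the same key overwrite earlier ones, leaving one per user.
user_records = {}
for record in record_list:
    user_records[user_key(record)] = record

# Count ages over the deduplicated records only.
histogram = collections.defaultdict(int)
for record in user_records.values():
    histogram[record['age']] += 1

print(dict(histogram))
```

Because the last record for a key wins, this only suits cases where the differences between duplicate records do not matter, which is what the question states.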

Once you have eliminated the duplicates, though, there's also a shortcut if you're using Python >= 2.7:

# .values() works on Python 2.7 and 3; itervalues() exists only on Python 2.
histogram = collections.Counter(d['age'] for d in user_records.values())

Some example code... say we have a record_list:

>>> record_list
[{'lastname': 'Mann', 'age': 23, 'firstname': 'Joe'}, 
 {'lastname': 'Moore', 'age': 23, 'firstname': 'Alex'}, 
 {'lastname': 'Sault', 'age': 33, 'firstname': 'Marie'}, 
 {'lastname': 'Mann', 'age': 23, 'firstname': 'Joe'}]
>>> user_ages = dict(((d['firstname'], d['lastname']), d['age']) for d in record_list)
>>> user_ages
{('Joe', 'Mann'): 23, ('Alex', 'Moore'): 23, ('Marie', 'Sault'): 33}

As you can see, the record_list has a duplicate, but the user_ages dict doesn't. Now getting a count of ages is as simple as running the values through a Counter.

>>> collections.Counter(user_ages.values())
Counter({23: 2, 33: 1})

The same thing can be done with any string or other hashable (immutable) object that can serve as a unique identifier of a particular user.
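If the goal is to look at the histogram rather than just hold the counts, one possible way to show it is as text. This is only an illustrative sketch: the format_histogram helper is hypothetical, and the counts are hard-coded to match the Counter result above.

```python
import collections

# Age counts after removing duplicates (same values as the Counter above).
ages = collections.Counter({23: 2, 33: 1})

def format_histogram(counts):
    # Hypothetical helper: one line per age, sorted by age, one '#' per user.
    return ['{0:>3} | {1}'.format(age, '#' * n)
            for age, n in sorted(counts.items())]

for line in format_histogram(ages):
    print(line)
```

Sorting by age keeps the rows in a stable order regardless of how the Counter was filled.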
