Grouping on items in a list in Python


Question

I have 60 records with two columns: "IdNo" and "skillsList", where "skillsList" is a list of skills. I want to find out how many "IdNo" values have a skill in common.

How can I do this in Python? I don't know how to count the occurrences of a particular list item. I would appreciate any help.

>>> a = open(r"C:\Users\abc\Desktop\Book2.csv")
>>> a1 = a.read()
>>> type(a1)
<type 'str'>

Here is some of the text I get when I print a1:

>>> a1
'IdNo, skillsList\n1,"u\'Training\', u\'E-Learning\', u\'PowerPoint\', u\'Teaching\', u\'Accounting\', u\'Team Management\', u\'Team Building\', u\'Microsoft Excel\', u\'Microsoft Office\', u\'Financial Accounting\', u\'Microsoft Word\', u\'Customer Service\'"\n2,"u\'Telecommunications\', u\'Data Center\', u\'ISO 27001\', u\'Management\', u\'BS25999\', u\'Technology\', u\'Information Technology...\', u\'Certified PMP\\xae\', u\'Certified BS25999 Lead...\'"\n3,"u\'Market Research\', u\'Segmentation\', u\'Marketing Strategy\', u\'Consumer Behavior\', u\'Experience Working with...\'"

Thanks.

Recommended answer

You can build an inverted index of skills: a dictionary in which each key is a skill name and each value is the set of IdNo values that have that skill. That way you can also find out which IdNos have a given set of skills.

The code would look like this:

skills = {}
with open('filename.txt') as f:
    for line in f:
        # Each line is split on commas: first item is the idNo, the rest are skills.
        items = [item.strip() for item in line.split(',')]
        idNo = items[0]
        skill_list = items[1:]
        for skill in skill_list:
            # Create the set the first time a skill is seen, then record the idNo.
            skills.setdefault(skill, set()).add(idNo)
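The file shown in the question is not a plain comma-separated list of skills: it has a header row, and each skills field is a quoted string of `u'...'` items. A minimal sketch of the same inverted index for that layout could use the standard `csv` module plus a regular expression. The function name `build_skill_index`, the pattern `SKILL_PATTERN`, and the sample rows are made up for illustration, and the pattern assumes that no skill name contains an apostrophe.

```python
import csv
import io
import re
from collections import defaultdict

# Assumption: every skill is written as u'...' and contains no apostrophe.
SKILL_PATTERN = re.compile(r"u'([^']*)'")


def build_skill_index(csv_file):
    """Map each skill name to the set of IdNo values that list it."""
    skills = defaultdict(set)
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the "IdNo, skillsList" header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        id_no, skills_field = row[0].strip(), row[1]
        for skill in SKILL_PATTERN.findall(skills_field):
            skills[skill].add(id_no)
    return skills


# Hypothetical sample in the same shape as the data in the question.
sample = io.StringIO(
    "IdNo, skillsList\n"
    "1,\"u'Training', u'PowerPoint', u'Teaching'\"\n"
    "2,\"u'Training', u'Management'\"\n"
    "3,\"u'PowerPoint', u'Market Research'\"\n"
)
index = build_skill_index(sample)
print(sorted(index["Training"]))
```

For a real file, passing `open(path, newline='')` instead of the `io.StringIO` object should work the same way in Python 3. Skill names are matched case-sensitively, so 'PowerPoint' and 'Powerpoint' would end up under different keys unless they are normalized first.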

Now you have a skills dictionary that would look like this:

skills = {
    'Training': {'1', '2', '3'},
    'Powerpoint': {'1', '3', '4'},
    'E-learning': {'9', '10', '11'},
    # ...
}

Here you can see that IdNos 1, 3 and 4 have Powerpoint as a skill. If you want to know which IdNos have both the 'Training' and 'Powerpoint' skills, you can do:

skills['Powerpoint'].intersection(skills['Training'])

And if you want to know which IdNos have either the 'Training' or the 'Powerpoint' skill, you can do:

skills['Powerpoint'].union(skills['Training'])
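To get back to the original question of how many IdNos share a skill, the size of each set in the index gives the count directly. The sketch below uses a small hand-written index that mirrors the example dictionary in this answer; the names `counts` and `shared` are illustrative only.

```python
# Hypothetical index in the shape produced by the inverted-index code.
skills = {
    'Training': {'1', '2', '3'},
    'Powerpoint': {'1', '3', '4'},
    'E-learning': {'9', '10', '11'},
}

# Number of IdNo values that list each skill.
counts = {skill: len(ids) for skill, ids in skills.items()}

# Skills held by at least two IdNo values, most common first,
# with ties broken alphabetically by skill name.
shared = sorted(
    ((skill, n) for skill, n in counts.items() if n >= 2),
    key=lambda pair: (-pair[1], pair[0]),
)

# The & and | operators are shorthand for intersection() and union().
both = skills['Powerpoint'] & skills['Training']
either = skills['Powerpoint'] | skills['Training']

print(counts['Training'])
print(sorted(both))
```

The threshold of two in `shared` is an arbitrary choice for "in common"; it can be raised to list only the skills shared by larger groups.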
