How to group similar items in a list?

Question

I am looking to group similar items in a list based on the first three characters in the string. For example:

test = ['abc_1_2', 'abc_2_2', 'hij_1_1', 'xyz_1_2', 'xyz_2_2']

How can I group the above list items into groups based on the first grouping of letters (e.g. 'abc')? The following is the intended output:

output = {1: ('abc_1_2', 'abc_2_2'), 2: ('hij_1_1',), 3: ('xyz_1_2', 'xyz_2_2')}

output = [['abc_1_2', 'abc_2_2'], ['hij_1_1'], ['xyz_1_2', 'xyz_2_2']]


I have tried using itertools.groupby to accomplish this without success:

>>> import os, itertools
>>> test = ['abc_1_2', 'abc_2_2', 'hij_1_1', 'xyz_1_2', 'xyz_2_2']
>>> [list(g) for k.split("_")[0], g in itertools.groupby(test)]
[['abc_1_2'], ['abc_2_2'], ['hij_1_1'], ['xyz_1_2'], ['xyz_2_2']]


I have looked at the following posts without success:

How to merge similar items in a list. The example groups similar items (e.g. 'house' and 'Hose') using an approach that is overly complicated for my example.

How can I group equivalent items together in a Python list?. This is where I found the idea for the list comprehension.

Recommended answer

The .split("_")[0] part should be inside a single-argument function that you pass as the second argument to itertools.groupby.

>>> import os, itertools
>>> test = ['abc_1_2', 'abc_2_2', 'hij_1_1', 'xyz_1_2', 'xyz_2_2']
>>> [list(g) for _, g in itertools.groupby(test, lambda x: x.split('_')[0])]
[['abc_1_2', 'abc_2_2'], ['hij_1_1'], ['xyz_1_2', 'xyz_2_2']]
>>>

Having it in the for ... part does nothing since the result is immediately discarded.
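
One caveat worth keeping in mind: itertools.groupby only merges consecutive items that share a key, so the one-liner above relies on the list already being ordered by prefix. A minimal sketch for unordered input, assuming sorting first is acceptable (the helper name group_by_prefix and the shuffled sample list are hypothetical, not from the original post):

```python
import itertools

def group_by_prefix(items):
    """Group strings by the text before the first underscore."""
    key = lambda s: s.split('_')[0]
    # Sorting with the same key places equal prefixes next to each other,
    # which groupby needs in order to treat them as a single group.
    ordered = sorted(items, key=key)
    return [list(g) for _, g in itertools.groupby(ordered, key)]

shuffled = ['xyz_1_2', 'abc_1_2', 'hij_1_1', 'abc_2_2', 'xyz_2_2']
grouped = group_by_prefix(shuffled)
print(grouped)
```

Because sorted is stable, items with the same prefix keep their original relative order inside each group.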

Also, it would be slightly more efficient to use str.partition when you only want a single split:

[list(g) for _, g in itertools.groupby(test, lambda x: x.partition('_')[0])]
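
The two calls are interchangeable here because str.partition splits at the first occurrence of the separator and always returns a 3-tuple (head, separator, tail), so index 0 matches split('_')[0] even when no underscore is present. A small sketch of that behaviour (the sample strings are only illustrative):

```python
samples = ['abc_1_2', 'hij_1_1', 'nounderscore']

for s in samples:
    head, sep, tail = s.partition('_')
    # partition stops at the first separator; split('_') splits at every one,
    # but the first piece is the same in both cases.
    assert head == s.split('_')[0]
    print(repr(s), '->', (head, sep, tail))
```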

Timing demo:

>>> from timeit import timeit
>>> timeit("'hij_1_1'.split('_')")
1.3149855638076913
>>> timeit("'hij_1_1'.partition('_')")
0.7576401470019234
>>>

This isn't a major concern as both methods are pretty fast on small strings, but I figured I'd mention it.
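
If the numbered dictionary form from the question is wanted instead of a list of lists, the same groupby call can feed a dict comprehension. This is a sketch under the assumption that the groups should be numbered from 1 in order of appearance, as in the question's example:

```python
import itertools

test = ['abc_1_2', 'abc_2_2', 'hij_1_1', 'xyz_1_2', 'xyz_2_2']

groups = itertools.groupby(test, lambda x: x.partition('_')[0])
# Each group iterator is consumed by tuple() before groupby advances,
# so no items are lost between iterations.
output = {i: tuple(g) for i, (_, g) in enumerate(groups, start=1)}
print(output)
```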
