Assign a number to each unique value in a list

Question

I have a list of strings. I want to assign a unique number to each distinct string (the exact number is not important) and create a list of the same length that holds these numbers, in order. Below is my best attempt, but I am not happy with it for two reasons:

  1. It assumes that equal values are next to each other.

  2. I had to start the list with a 0, otherwise the output would be incorrect.

My code:

names = ['ll', 'll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'LL', 'HL', 'HL', 'HL']
numbers = [0]
num = 0
for item in range(len(names)):
    if item == len(names) - 1:
        break
    elif names[item] == names[item+1]:
        numbers.append(num)
    else:
        num = num + 1
        numbers.append(num)
print(numbers)

I want to make the code more generic, so that it works with an arbitrary list. Any ideas?

Recommended answer

Without using an external library (see the EDIT for a Pandas solution), you can do it as follows:

d = {ni: indi for indi, ni in enumerate(set(names))}
numbers = [d[ni] for ni in names]

Brief explanation:

In the first line, you assign a number to each unique element of your list and store the result in the dictionary d. The dictionary is built with a dictionary comprehension; set returns the unique elements of names.

Then, in the second line, a list comprehension looks up each name in d and stores the resulting numbers in the list numbers.

An example to illustrate that it also works for unsorted lists:

# 'll' appears all over the place
names = ['ll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'll', 'LL', 'HL', 'HL', 'HL', 'll']

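This is the output for numbers (the specific numbers can differ between runs, because the iteration order of a set is arbitrary):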

[1, 1, 3, 3, 3, 2, 2, 1, 2, 0, 0, 0, 1]

As you can see, the number 1 associated with ll appears at the correct places.
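If reproducible numbers matter, one possible variation (not part of the original answer) is to sort the unique values before numbering them. The sketch below assumes the values are mutually comparable, as strings are; the function name number_by_sorted_value is made up for illustration.

```python
def number_by_sorted_value(names):
    """Return one number per element; equal elements get equal numbers."""
    # sorted() fixes the order of the unique values, so the mapping
    # no longer depends on the arbitrary iteration order of the set.
    mapping = {name: index for index, name in enumerate(sorted(set(names)))}
    return [mapping[name] for name in names]


names = ['ll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'll', 'LL', 'HL', 'HL', 'HL', 'll']
numbers = number_by_sorted_value(names)
print(numbers)  # [3, 3, 2, 2, 2, 1, 1, 3, 1, 0, 0, 0, 3]
```

Uppercase letters sort before lowercase ones, so here HL, LL, hl and ll receive 0, 1, 2 and 3 respectively.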

EDIT

If you have Pandas available, you can also use pandas.factorize (which seems to be quite efficient for huge lists and also works fine for lists of tuples):

import pandas as pd

pd.factorize(names)

This returns

(array([0, 0, 1, 1, 1, 2, 2, 0, 2, 3, 3, 3, 0]),
 array(['ll', 'hl', 'LL', 'HL'], dtype=object))

Therefore,

numbers = pd.factorize(names)[0]
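For comparison, the same first-appearance numbering can be sketched without Pandas. This is only a rough plain-Python analogue, not how pandas.factorize is implemented; the function name factorize_in_order is hypothetical, and the sketch assumes hashable values and Python 3.7+ (where dictionaries keep insertion order).

```python
def factorize_in_order(values):
    """Return (codes, uniques), numbering values in order of first appearance."""
    first_seen = {}
    codes = []
    for value in values:
        # setdefault stores len(first_seen) only when the value is new;
        # otherwise it returns the number assigned earlier.
        codes.append(first_seen.setdefault(value, len(first_seen)))
    # Dictionary keys are in insertion order, i.e. first-appearance order.
    return codes, list(first_seen)


names = ['ll', 'll', 'hl', 'hl', 'hl', 'LL', 'LL', 'll', 'LL', 'HL', 'HL', 'HL', 'll']
numbers, uniques = factorize_in_order(names)
print(numbers)  # [0, 0, 1, 1, 1, 2, 2, 0, 2, 3, 3, 3, 0]
print(uniques)  # ['ll', 'hl', 'LL', 'HL']
```

Because tuples are hashable, the same function also handles a list of tuples.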
