In Python, how can I naturally sort a list of alphanumeric strings such that alpha characters sort ahead of numeric characters?

Problem description

This is a fun little challenge that confronted me recently. I'll provide my answer below, but I'm curious to see whether there are more elegant or efficient solutions.

A delineation of the requirements as they were presented to me:

  1. Strings are alphanumeric (see test dataset below)
  2. Strings should be sorted naturally (see this question for explanation)
  3. Alpha characters should be sorted ahead of numeric characters (i.e. 'abc' before '100')
  4. Uppercase instances of alpha chars should be sorted ahead of lowercase instances (i.e. 'ABc', 'Abc', 'abc')
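
To see why a plain `sorted()` call does not meet these requirements, here is a minimal sketch. The sample list `words` is an illustrative assumption and is not part of the test data in the question.

```python
# Hypothetical sample data, chosen only to illustrate requirements 2 and 3.
words = ['abc', '100', 'a10', 'a2']

# Default string comparison goes character by character, and the ASCII
# digits have lower code points than the ASCII letters.
plain = sorted(words)
print(plain)  # ['100', 'a10', 'a2', 'abc']
```

Here '100' lands ahead of 'abc' (contrary to requirement 3) and 'a10' lands ahead of 'a2' (contrary to requirement 2), which is why a custom sort key is needed.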

Here is a test dataset:

test_cases = [
    # (unsorted list, sorted list)
    (list('bca'), ['a', 'b', 'c']),
    (list('CbA'), ['A', 'b', 'C']),
    (list('r0B9a'), ['a', 'B', 'r', '0', '9']),
    (['a2', '1a', '10a', 'a1', 'a100'], ['a1', 'a2', 'a100', '1a', '10a']),
    (['GAM', 'alp2', 'ALP11', '1', 'alp100', 'alp10', '100', 'alp1', '2'],
        ['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']),
    (list('ra0b9A'), ['A', 'a', 'b', 'r', '0', '9']),
    (['Abc', 'abc', 'ABc'], ['ABc', 'Abc', 'abc']),
]


Bonus test case

This is inspired by Janne Karila's comment below, which points out a case that the selected answer currently fails (though it wouldn't really be a practical concern in my case):

(['0A', '00a', 'a', 'A', 'A0', '00A', '0', 'a0', '00', '0a'],
        ['A', 'a', 'A0', 'a0', '0', '00', '0A', '00A', '0a', '00a'])

Recommended answer

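import re

# Split a string into runs of digits and runs of non-digits.
re_natural = re.compile(r'[0-9]+|[^0-9]+')

def natural_key(s):
    # Digit runs compare numerically and sort after alpha runs (tag 1 vs 0);
    # the original string is appended as a case-sensitive tie-breaker.
    return [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in re_natural.findall(s)] + [s]

for case in test_cases:
    print(case[1])
    print(sorted(case[0], key=natural_key))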

['a', 'b', 'c']
['a', 'b', 'c']
['A', 'b', 'C']
['A', 'b', 'C']
['a', 'B', 'r', '0', '9']
['a', 'B', 'r', '0', '9']
['a1', 'a2', 'a100', '1a', '10a']
['a1', 'a2', 'a100', '1a', '10a']
['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']
['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']
['A', 'a', 'b', 'r', '0', '9']
['A', 'a', 'b', 'r', '0', '9']
['ABc', 'Abc', 'abc']
['ABc', 'Abc', 'abc']
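
One caveat when running `natural_key` under Python 3: the key mixes tuples with a trailing plain string, and Python 3 raises a `TypeError` when asked to order a `str` against a `tuple`. The test cases above never trigger that comparison, but a pair such as 'a' and 'a1' does. The variant below is a sketch and not part of the original answer; the name `natural_key_py3` and the `(-1, s)` wrapper are my own assumptions. Tagging the tie-breaker with -1 means a shorter key still sorts ahead of a longer key that shares its prefix.

```python
import re

re_natural = re.compile(r'[0-9]+|[^0-9]+')

def natural_key_py3(s):
    # Hypothetical variant: same natural parts as natural_key, but the
    # tie-breaker is wrapped in a tuple so every key element is a tuple.
    parts = [(1, int(c)) if c.isdigit() else (0, c.lower())
             for c in re_natural.findall(s)]
    return parts + [(-1, s)]

# 'a' and 'a1' share the part (0, 'a'), so their second elements get compared.
print(sorted(['a1', 'a', 'A'], key=natural_key_py3))  # ['A', 'a', 'a1']
```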

I decided to revisit this question and see if it would be possible to handle the bonus case. It requires being more sophisticated in the tie-breaker portion of the key. To match the desired results, the alpha parts of the key must be considered before the numeric parts. I also added a marker between the natural section of the key and the tie-breaker so that short keys always come before long ones.

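def natural_key2(s):
    parts = re_natural.findall(s)
    # Case-insensitive natural ordering: alpha runs (0) sort ahead of digit runs (1).
    natural = [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in parts]
    # Tie-breakers use the original text: alpha runs first, then digit runs.
    ties_alpha = [c for c in parts if not c.isdigit()]
    ties_numeric = [c for c in parts if c.isdigit()]
    # (-1,) sorts below every (0, ...) and (1, ...) part, so shorter keys come first.
    return natural + [(-1,)] + ties_alpha + ties_numeric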

This generates identical results for the test cases above, plus the desired output for the bonus case:

['A', 'a', 'A0', 'a0', '0', '00', '0A', '00A', '0a', '00a']
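
The claim above can be checked mechanically. The following is a self-contained sketch for Python 3 that repeats `natural_key2` and the question's data so it runs on its own; the name `checks` is my own choice for illustration.

```python
import re

re_natural = re.compile(r'[0-9]+|[^0-9]+')

def natural_key2(s):
    parts = re_natural.findall(s)
    natural = [(1, int(c)) if c.isdigit() else (0, c.lower()) for c in parts]
    ties_alpha = [c for c in parts if not c.isdigit()]
    ties_numeric = [c for c in parts if c.isdigit()]
    return natural + [(-1,)] + ties_alpha + ties_numeric

# (unsorted list, expected sorted list): the question's cases plus the bonus case.
checks = [
    (list('bca'), ['a', 'b', 'c']),
    (list('CbA'), ['A', 'b', 'C']),
    (list('r0B9a'), ['a', 'B', 'r', '0', '9']),
    (['a2', '1a', '10a', 'a1', 'a100'], ['a1', 'a2', 'a100', '1a', '10a']),
    (['GAM', 'alp2', 'ALP11', '1', 'alp100', 'alp10', '100', 'alp1', '2'],
        ['alp1', 'alp2', 'alp10', 'ALP11', 'alp100', 'GAM', '1', '2', '100']),
    (list('ra0b9A'), ['A', 'a', 'b', 'r', '0', '9']),
    (['Abc', 'abc', 'ABc'], ['ABc', 'Abc', 'abc']),
    (['0A', '00a', 'a', 'A', 'A0', '00A', '0', 'a0', '00', '0a'],
        ['A', 'a', 'A0', 'a0', '0', '00', '0A', '00A', '0a', '00a']),
]

for unsorted, expected in checks:
    result = sorted(unsorted, key=natural_key2)
    assert result == expected, (result, expected)
```

Every element before the `(-1,)` marker is a tuple, and two keys only reach the string tie-breakers when their natural sections are identical, so strings are only ever compared with strings.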
