Average of two strings in alphabetical/lexicographical order

Problem description

Suppose you take the strings 'a' and 'z' and list all the strings that come between them in alphabetical order: ['a','b','c' ... 'x','y','z']. Take the midpoint of this list and you find 'm'. So this is kind of like taking an average of those two strings.

You could extend it to strings with more than one character, for example the midpoint between 'aa' and 'zz' would be found in the middle of the list ['aa', 'ab', 'ac' ... 'zx', 'zy', 'zz'].
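The list-and-midpoint idea above can be written out literally for short, equal-length strings. This is only a minimal sketch to make the definition concrete: the names `ALPHABET` and `brute_force_midpoint` are made up for illustration, and the choice of the lower middle element for even-length lists is an assumption made so that 'a' and 'z' give 'm'.

```python
from itertools import product

ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def brute_force_midpoint(a, b):
    """Return the middle entry of the sorted list of strings from a to b."""
    if len(a) != len(b):
        raise ValueError('this sketch only handles equal-length strings')
    lo, hi = sorted((a, b))
    # product() yields tuples in lexicographic order because ALPHABET is sorted.
    candidates = (''.join(t) for t in product(ALPHABET, repeat=len(a)))
    between = [s for s in candidates if lo <= s <= hi]
    # Assumption: for an even-length list, take the lower of the two middle entries.
    return between[(len(between) - 1) // 2]

print(brute_force_midpoint('a', 'z'))    # m
print(brute_force_midpoint('aa', 'zz'))  # mz
```

The list grows as 26 to the power of the string length, so this is only practical for very short strings.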

Might there be a Python method somewhere that does this? If not, even knowing the name of the algorithm would help.

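I started writing my own routine that simply walks through both strings and takes the midpoint of the first differing letter. That seemed to work well, in that the midpoint of 'aa' and 'az' was 'am', but it fails on 'cat' and 'doggie', whose midpoint it thinks is 'c'. I tried Googling "binary search string midpoint" and similar phrases, but without knowing the name of what I am trying to do here I had little luck.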
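For reference, here is one possible reading of that first-differing-letter routine. The original code is not shown in the question, so this is a hypothetical reconstruction (the name `first_difference_midpoint` and the prefix fallback are assumptions), included only to show why it gives 'am' in one case and just 'c' in the other.

```python
ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def first_difference_midpoint(a, b):
    """Keep the common prefix, then average only the first differing letter."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            mid = (ALPHABET.index(x) + ALPHABET.index(y)) // 2
            return a[:i] + ALPHABET[mid]
    # Assumption: if one string is a prefix of the other, return that prefix.
    return a[:min(len(a), len(b))]

print(first_difference_midpoint('aa', 'az'))       # am
print(first_difference_midpoint('cat', 'doggie'))  # c
```

Because everything after the first differing letter is ignored, 'c' and 'd' average to 'c', and the rest of both words never contributes.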

I added my own solution as an answer.

Recommended answer

If you define an alphabet of characters, you can treat each string as a number written in base N, where N is the size of the alphabet: convert both strings to integers, average them, and convert the result back to base N.

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def enbase(x):
    # Convert a non-negative integer to a string of base-N digits from alphabet.
    n = len(alphabet)
    if x < n:
        return alphabet[x]
    # Floor division (//) keeps x an integer on both Python 2 and Python 3.
    return enbase(x // n) + alphabet[x % n]

def debase(x):
    # Convert a string to the integer it represents in base N.
    n = len(alphabet)
    result = 0
    for i, c in enumerate(reversed(x)):
        result += alphabet.index(c) * (n**i)
    return result

def average(a, b):
    a = debase(a)
    b = debase(b)
    return enbase((a + b) // 2)

print(average('a', 'z')) #m
print(average('aa', 'zz')) #mz
print(average('cat', 'doggie')) #budeel
print(average('google', 'microsoft')) #gebmbqkil
print(average('microsoft', 'google')) #gebmbqkil

Edit: Based on comments and other answers, you might want to handle strings of different lengths by appending the first letter of the alphabet to the shorter word until they're the same length. This will result in the "average" falling between the two inputs in a lexicographical sort. Code changes and new outputs below.

def pad(x, n):
    # Right-pad x with the first letter of the alphabet up to length n.
    p = alphabet[0] * (n - len(x))
    return '%s%s' % (x, p)

def average(a, b):
    n = max(len(a), len(b))
    a = debase(pad(a, n))
    b = debase(pad(b, n))
    return enbase((a + b) // 2)

print(average('a', 'z')) #m
print(average('aa', 'zz')) #mz
print(average('aa', 'az')) #m (enbase drops the leading 'a'; at full width this is 'am')
print(average('cat', 'doggie')) #cumqec
print(average('google', 'microsoft')) #jlilzyhcw
print(average('microsoft', 'google')) #jlilzyhcw
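
One caveat with the padded version: `enbase` drops leading 'a' characters, because 'a' plays the role of the digit zero. The average of 'aa' and 'az' therefore prints as 'm', which does not sort between the two inputs. A minimal sketch of one way around this is to emit the result at the same width as the padded inputs. The helper names below (`to_int`, `to_fixed_width`, `average_fixed`) are hypothetical and not part of the original answer.

```python
ALPHABET = 'abcdefghijklmnopqrstuvwxyz'

def to_int(s):
    """Interpret s as a base-N number, most significant letter first."""
    n = len(ALPHABET)
    value = 0
    for c in s:
        value = value * n + ALPHABET.index(c)
    return value

def to_fixed_width(value, width):
    """Write value as exactly `width` base-N letters, keeping leading 'a's."""
    n = len(ALPHABET)
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, n)
        digits.append(ALPHABET[remainder])
    return ''.join(reversed(digits))

def average_fixed(a, b):
    """Average two strings after right-padding the shorter one with 'a'."""
    width = max(len(a), len(b))
    a_val = to_int(a.ljust(width, ALPHABET[0]))
    b_val = to_int(b.ljust(width, ALPHABET[0]))
    return to_fixed_width((a_val + b_val) // 2, width)

print(average_fixed('aa', 'az'))       # am
print(average_fixed('cat', 'doggie'))  # cumqec
```

With a fixed width, numeric order and lexicographic order agree for the padded strings, so the result sorts between the two padded inputs.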
