Python中的运行长度编码 [英] Run length encoding in Python

查看:61
本文介绍了Python中的运行长度编码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试编写一个简单的 python 算法来解决这个问题.你能帮我弄清楚为什么我的代码不起作用:

I'm trying to write a simple python algorithm to solve this problem. Can you please help me figure out why my code is not working:

问题:

如果任何字符重复超过 4 次,则整个集合重复的字符应替换为斜杠/",后跟一个2 位数字,这是这组重复字符的长度,和性格.例如,aaaaa"将被编码为/05a".自执行以来不应替换 4 个或更少字符的运行编码不会减少字符串的长度.

If any character is repeated more than 4 times, the entire set of repeated characters should be replaced with a slash '/', followed by a 2-digit number which is the length of this run of repeated characters, and the character. For example, "aaaaa" would be encoded as "/05a". Runs of 4 or less characters should not be replaced since performing the encoding would not decrease the length of the string.

我的代码:

def runLengthEncode (plainText):
    res=''
    a=''
    for i in plainText:
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>4:
                res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
                a=i
    return(res)

推荐答案

我在这里看到了很多很棒的解决方案,但没有一个让我觉得非常 Pythonic.所以我正在贡献我今天为这个问题写的一个实现.

I see many great solutions here but none that feels very pythonic to my eyes. So I'm contributing with a implementation I wrote myself today for this problem.

def run_length_encode(data: str) -> Iterator[Tuple[str, int]]:
    """Returns run length encoded Tuples for string"""
    # A memory efficient (lazy) and pythonic solution using generators
    return ((x, sum(1 for _ in y)) for x, y in groupby(data))

这将返回一个带有字符和实例数的元组生成器,但也可以很容易地修改为返回一个字符串.这样做的一个好处是,如果您不需要用尽整个搜索空间,那么它都是惰性计算的,并且不会消耗比需要更多的内存或 cpu.

This will return a generator of Tuples with the character and number of instances, but can easily be modified to return a string as well. A benefit of doing it this way is that it's all lazy evaluated and won't consume more memory or cpu than needed if you don't need to exhaust the entire search space.

如果您仍然想要字符串编码,可以很容易地针对该用例修改代码,如下所示:

If you still want string encoding the code can quite easily be modified for that use case like this:

def run_length_encode(data: str) -> str:
    """Returns run length encoded string for data"""
    # A memory efficient (lazy) and pythonic solution using generators
    return "".join(f"{x}{sum(1 for _ in y)}" for x, y in groupby(data))

这是一种更通用的运行长度编码,适用于所有长度,而不仅仅是超过 4 个字符的编码.但是,如果需要,这也可以很容易地通过字符串的条件进行调整.

This is a more generic run length encoding for all lengths, and not just for those of over 4 characters. But this could also quite easily be adapted with a conditional for the string if wanted.

这篇关于Python中的运行长度编码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆