How to find all the unique substrings of a very long string?

Views: 109 Posted: 2015/11/30 15:37:58 python string algorithm memory large-data


Problem description

I have a very long string, and I want to find all of its unique substrings. I wrote code that uses a Python set to store the substrings and ensure uniqueness. It gives correct results for many medium and large strings, but for very large strings I get a MemoryError. After some searching, I found that Python's set data structure has a large RAM footprint, which may be why I am getting the MemoryError.

Here is my code:

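n = int(raw_input())  # number of input strings (assumed; n is not defined in the original snippet)
a = set()  # every distinct substring seen so far, across all inputs
for _ in range(n):
    string = raw_input()
    # Enumerate substrings by length, then by start index.
    for length in xrange(1, len(string) + 1):
        for start in xrange(len(string) - length + 1):
            a.add(string[start:start + length])
print sorted(a)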
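
A rough way to see why the set runs out of memory: a string of length n has n(n+1)/2 substrings when counted by start position and length, and together they hold n(n+1)(n+2)/6 characters. In the worst case all of them are distinct, so the set must keep every one. The sketch below (Python 3, with a hypothetical helper name `substring_storage_upper_bound`) only computes these upper bounds; it ignores Python's per-object overhead, so real memory use would be higher still.

```python
def substring_storage_upper_bound(n):
    """Return (count, total_chars) for all substrings of a length-n string.

    These are worst-case figures, reached only when every substring is distinct.
    """
    count = n * (n + 1) // 2                  # number of (start, length) pairs
    total_chars = n * (n + 1) * (n + 2) // 6  # sum of L * (n - L + 1) for L = 1..n
    return count, total_chars


# Show how quickly the bounds grow with the input length.
for n in (100, 1000, 10000):
    count, total_chars = substring_storage_upper_bound(n)
    print(n, count, total_chars)
```

The character total grows with the cube of the length, so a tenfold longer input needs roughly a thousand times more storage in the worst case.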

Is there a way to avoid this error for large strings? Or can anyone suggest a modification to my code that handles this better?

P.S. I do not have the option of switching between 32-bit and 64-bit versions.
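
One possible direction, offered as a sketch and not as a definitive fix: yield the unique substrings one at a time in sorted order, so they never all sit in a set at once. The idea is to sort the suffixes of the string and, for each suffix, emit only the prefixes that are longer than its common prefix with the previous suffix. The function name `iter_unique_substrings` is hypothetical, the code is Python 3, and it handles a single string, whereas the question's code merges several inputs. The naive suffix sort used here still materializes every suffix as a sort key, so a proper suffix-array construction would be needed for truly large inputs.

```python
def iter_unique_substrings(s):
    """Yield each distinct substring of s exactly once, in sorted order."""
    # Naive suffix sort: simple, but it builds every suffix as a sort key.
    suffix_starts = sorted(range(len(s)), key=lambda i: s[i:])
    prev = ""
    for start in suffix_starts:
        suffix = s[start:]
        # Length of the common prefix with the previous suffix in sorted order.
        lcp = 0
        limit = min(len(prev), len(suffix))
        while lcp < limit and prev[lcp] == suffix[lcp]:
            lcp += 1
        # Prefixes of length <= lcp were already yielded; only longer ones are new.
        for end in range(lcp + 1, len(suffix) + 1):
            yield suffix[:end]
        prev = suffix


# Consume the substrings lazily instead of building a set and sorting it.
for sub in iter_unique_substrings("aba"):
    print(sub)
```

Because the result is a generator, each substring can be printed or written to a file as it is produced. The total amount of output is unchanged, so this reduces peak memory, not running time.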
