Writing huge strings in Python
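Problem description

I have a very long string, almost a megabyte long, that I need to write to a text file. The regular

file = open("file.txt", "w")
file.write(string)
file.close()

works but is too slow. Is there a way I can write faster?

I am trying to write a several-million-digit number to a text file. The number is on the order of math.factorial(67867957).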
This is what shows on profiling:

         203 function calls (198 primitive calls) in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 re.py:217(compile)
        1    0.000    0.000    0.000    0.000 re.py:273(_compile)
        1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
        1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
        4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
      3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)
        1    0.000    0.000    0.000    0.000 sre_compile.py:341(_compile_info)
        2    0.000    0.000    0.000    0.000 sre_compile.py:442(isstring)
        1    0.000    0.000    0.000    0.000 sre_compile.py:445(_code)
        1    0.000    0.000    0.000    0.000 sre_compile.py:460(compile)
        5    0.000    0.000    0.000    0.000 sre_parse.py:126(__len__)
       12    0.000    0.000    0.000    0.000 sre_parse.py:130(__getitem__)
        7    0.000    0.000    0.000    0.000 sre_parse.py:138(append)
      3/1    0.000    0.000    0.000    0.000 sre_parse.py:140(getwidth)
        1    0.000    0.000    0.000    0.000 sre_parse.py:178(__init__)
       10    0.000    0.000    0.000    0.000 sre_parse.py:183(__next)
        2    0.000    0.000    0.000    0.000 sre_parse.py:202(match)
        8    0.000    0.000    0.000    0.000 sre_parse.py:208(get)
        1    0.000    0.000    0.000    0.000 sre_parse.py:351(_parse_sub)
        2    0.000    0.000    0.000    0.000 sre_parse.py:429(_parse)
        1    0.000    0.000    0.000    0.000 sre_parse.py:67(__init__)
        1    0.000    0.000    0.000    0.000 sre_parse.py:726(fix_flags)
        1    0.000    0.000    0.000    0.000 sre_parse.py:738(parse)
        3    0.000    0.000    0.000    0.000 sre_parse.py:90(__init__)
        1    0.000    0.000    0.000    0.000 {built-in method compile}
        1    0.001    0.001    0.001    0.001 {built-in method exec}
       17    0.000    0.000    0.000    0.000 {built-in method isinstance}
    39/38    0.000    0.000    0.000    0.000 {built-in method len}
        2    0.000    0.000    0.000    0.000 {built-in method max}
        8    0.000    0.000    0.000    0.000 {built-in method min}
        6    0.000    0.000    0.000    0.000 {built-in method ord}
       48    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        5    0.000    0.000    0.000    0.000 {method 'find' of 'bytearray' objects}
        1    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
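Answer

Your issue is that str(long) is very slow for large integers (millions of digits) in Python. It is a quadratic operation (in the number of digits), i.e., for ~1e8 digits it may require ~1e16 operations to convert the integer to a decimal string.

Writing 500 MB to a file should not take hours, e.g.:

$ python3 -c 'open("file", "w").write("a"*500*1000000)'

returns almost immediately. ls -l file confirms that the file is created and has the expected size.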
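To check on a smaller scale that the conversion, not the write, is the expensive step, the two can be timed separately. The following is a minimal sketch and not part of the original answer: the helper name time_conversion_and_write and the sample sizes are made up for illustration, and the sys.set_int_max_str_digits call is only there because newer Python versions cap int-to-str conversion by default.

```python
import math
import os
import sys
import tempfile
import time

# Newer Python versions refuse to convert very long ints to str by default;
# lift that cap where the setting exists (assumption: 0 means "no limit").
if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(0)


def time_conversion_and_write(k, path):
    """Return (seconds for str(k!), seconds for the write, characters written)."""
    n = math.factorial(k)

    start = time.perf_counter()
    digits = str(n)  # the step the answer identifies as slow
    convert_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "w") as f:
        written = f.write(digits)
    write_seconds = time.perf_counter() - start

    return convert_seconds, write_seconds, written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "digits.txt")
        # Illustrative sizes only; far smaller than the number in the question.
        for k in (5_000, 10_000, 20_000):
            convert_s, write_s, written = time_conversion_and_write(k, target)
            print(f"{k}!: {written} digits, str() {convert_s:.4f}s, write {write_s:.4f}s")
```

How the two timings compare will depend on the Python version and the machine, so the numbers are best read as a relative comparison rather than a benchmark.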
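Calculating math.factorial(67867957) (the result has ~500M digits) may take several hours, but saving it using pickle is instantaneous:

import math
import pickle

n = math.factorial(67867957)  # takes a long time
with open("file.pickle", "wb") as file:
    pickle.dump(n, file)  # very fast (comparatively)

Loading it back using n = pickle.load(open('file.pickle', 'rb')) takes less than a second.

str(n) is still running (after 50 hours) on my machine.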
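If pickle is not wanted, the same idea (store the binary value and skip the decimal conversion) can be sketched with int.to_bytes and int.from_bytes. This is an alternative the original answer does not mention; save_int and load_int are hypothetical helper names, and the sketch handles non-negative integers only.

```python
import math


def save_int(n, path):
    """Write a non-negative int to `path` as raw big-endian bytes."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # At least one byte, so that 0 round-trips as well.
    length = max(1, (n.bit_length() + 7) // 8)
    with open(path, "wb") as f:
        f.write(n.to_bytes(length, "big"))


def load_int(path):
    """Read back an int written by save_int."""
    with open(path, "rb") as f:
        return int.from_bytes(f.read(), "big")


if __name__ == "__main__":
    value = math.factorial(1000)  # small stand-in for the huge factorial
    save_int(value, "file.bin")
    assert load_int("file.bin") == value
```

Like the pickle file, the result is not human-readable; it only helps when the decimal digits themselves are not needed on disk.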
To get the decimal representation fast, you could use gmpy2:

$ python -c 'import gmpy2; open("file.gmpy2", "w").write(str(gmpy2.fac(67867957)))'

It takes less than 10 minutes on my machine.
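The one-liner can also be written as a small script. This sketch assumes gmpy2 is installed and falls back to math.factorial, with its slow str conversion, when it is not; factorial_digits and write_factorial are illustrative names rather than anything from the original answer.

```python
import math

try:
    import gmpy2  # third-party package; assumed to be installed
except ImportError:
    gmpy2 = None


def factorial_digits(k):
    """Return the decimal digits of k! as a string."""
    if gmpy2 is not None:
        # gmpy2.fac returns an mpz; str() on it yields the plain decimal digits.
        return str(gmpy2.fac(k))
    # Fallback: only practical for small k, since this is the slow path
    # the answer describes.
    return str(math.factorial(k))


def write_factorial(k, path):
    """Write the digits of k! to `path`; return the number of characters written."""
    digits = factorial_digits(k)
    with open(path, "w") as f:
        return f.write(digits)


if __name__ == "__main__":
    # Small illustrative input; the question's 67867957 needs gmpy2 and patience.
    print(write_factorial(1000, "file.gmpy2"), "digits written")
```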