What is the fastest way to record real-time data in Python with the least memory loss
Problem description
In every step of a loop I have some data which I want to save to my hard disk at the end.
One approach:
import pickle

data = []  # "list" would shadow the built-in name, so use another name
for i in range(int(1e10)):  # range() needs an int, not the float 1e10
    data.append(numpy_array_i)
pickle.dump(data, open(self.save_path, "wb"), protocol=4)
But I worry: 1. I might run out of memory because of the list. 2. If something crashes, all data will be lost. Because of this I have also thought of a way to save the data in real time, such as:
import csv

with open("data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for i in range(int(1e10)):
        writer.writerow(numpy_array_i)  # one row per step
For this I also worry that it may not be very fast, and I am not sure what the best tools might be. But openpyxl is probably a good choice.
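One middle ground worth noting (an illustration, not part of the original question or answer) is to append each array to a single pickle file as it is produced; pickle records are self-delimiting, so a crash loses at most the record being written, and the file can be read back record by record. The filename `stream.pkl` and the stand-in arrays below are assumptions for the sketch:

```python
import os
import pickle
import numpy as np

if os.path.exists("stream.pkl"):
    os.remove("stream.pkl")  # start fresh for this illustration

# Append one pickled array per step; each dump is self-delimiting.
with open("stream.pkl", "ab") as f:
    for i in range(100):
        arr = np.random.rand(10)  # stand-in for numpy_array_i
        pickle.dump(arr, f, protocol=4)
        f.flush()  # push to the OS buffer so a crash loses little

# Read everything back after the run:
records = []
with open("stream.pkl", "rb") as f:
    while True:
        try:
            records.append(pickle.load(f))
        except EOFError:
            break
```

Unlike building one big list and dumping it at the end, this keeps only one array in memory at a time.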
Recommended answer
I'd try SQLite, as it provides permanent storage on disk (so no data loss), yet it's faster than writing into a file as shown in your question, and it makes data lookup easier in case a previous run left incomplete data.
Tweaking JOURNAL_MODE can increase the performance further: https://blog.devart.com/increasing-sqlite-performance.html
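A minimal sketch of this approach, using Python's built-in sqlite3 module. The database filename, table layout, batch size, and stand-in arrays are all assumptions for illustration; each array is stored as a raw-bytes BLOB via `.tobytes()`, and WAL journal mode is the tweak the linked article describes:

```python
import sqlite3
import numpy as np

conn = sqlite3.connect("data.db")
conn.execute("PRAGMA journal_mode=WAL")  # faster commits, still crash-safe
conn.execute("DROP TABLE IF EXISTS samples")  # start fresh for this illustration
conn.execute("CREATE TABLE samples (step INTEGER PRIMARY KEY, payload BLOB)")

for i in range(1000):  # placeholder loop; the real loop produces numpy_array_i
    arr = np.random.rand(10)  # stand-in for numpy_array_i
    conn.execute(
        "INSERT INTO samples (step, payload) VALUES (?, ?)",
        (i, arr.tobytes()),
    )
    if i % 100 == 0:
        conn.commit()  # commit in batches to bound how much a crash can lose
conn.commit()
conn.close()
```

After a crash, everything committed so far survives, and a `SELECT` can tell you exactly which steps are present; arrays are recovered with `np.frombuffer(payload)` (you must know the dtype and shape, since `.tobytes()` does not store them).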