Copying a huge file using Python scripts
Problem description
I am using the Python function below to make a copy of a file, which I am processing as part of data ingestion in Azure Data Factory pipelines. This works for small files, but fails on huge files without returning any errors. When calling this function on a 2.2 GB file, it stops execution after writing 107 KB of data without throwing any exceptions. Can anyone point out what the issue could be here?
with open(Temp_File_Name, encoding='ISO-8859-1') as a, open(Load_File_Name, 'w') as b:
    for line in a:
        if ' blah blah ' in line:
            var = line[34:42]
            data = line[0:120] + var + '\n'
            b.write(data)
The input and output locations used here are files in Azure Blob storage. I am following this approach because I need to read each line and perform some operation on it after reading.
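As an aside, one detail worth checking in the snippet is that the output file is opened without an explicit encoding, so it uses the platform default. Below is a self-contained sketch of the same loop with both encodings pinned to ISO-8859-1; the marker string and slice offsets come from the snippet, while the paths and sample data are made up for illustration:

```python
import os
import tempfile

# Placeholder paths standing in for the blob-mounted locations.
tmp_dir = tempfile.mkdtemp()
temp_file = os.path.join(tmp_dir, 'input.txt')
load_file = os.path.join(tmp_dir, 'output.txt')

# Fabricated sample line wide enough for the slices used below.
with open(temp_file, 'w', encoding='ISO-8859-1') as f:
    f.write('x' * 30 + ' blah blah 12345678' + 'y' * 80 + '\n')

# Same per-line transform, with the output encoding made explicit
# so both files are read and written as ISO-8859-1.
with open(temp_file, encoding='ISO-8859-1') as a, \
        open(load_file, 'w', encoding='ISO-8859-1') as b:
    for line in a:
        if ' blah blah ' in line:
            var = line[34:42]
            b.write(line[0:120] + var + '\n')
```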
Recommended answer
Your snippet wastes compute/time loading the file into RAM and writing it back to disk. Use the underlying OS to copy the file via Python's pathlib and shutil.
Have a look at this Stack Overflow post. Here's the top answer:
import pathlib
import shutil
my_file = pathlib.Path('/etc/hosts')
to_file = pathlib.Path('/tmp/foo')
shutil.copy(str(my_file), str(to_file)) # For older Python.
shutil.copy(my_file, to_file) # For newer Python.
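When you already have open file objects rather than paths, `shutil.copyfileobj` is a middle ground: it streams the data in fixed-size binary chunks, so memory use stays constant even for multi-GB files. A minimal sketch; the paths and sample data here are placeholders, not the asker's blob-mounted locations:

```python
import os
import shutil
import tempfile

# Placeholder source file standing in for the multi-GB input.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'source.txt')
dst_path = os.path.join(tmp_dir, 'dest.txt')
with open(src_path, 'wb') as f:
    f.write(b'line one\nline two\n')

# Stream the copy in 1 MiB chunks; at most one chunk is ever
# held in memory at a time.
with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
    shutil.copyfileobj(src, dst, length=1024 * 1024)
```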