Copying a huge file using Python scripts


Problem description


I am using the Python function below to make a copy of a file, which I am processing as part of data ingestion in Azure Data Factory pipelines. This works for small files, but fails to process huge files without returning any errors. When calling this function for a 2.2 GB file, it stops execution after writing 107 KB of data without throwing any exception. Can anyone point out what the issue could be here?

 var = ''  # initialize so lines before the first marker don't raise NameError
 with open(Temp_File_Name, encoding='ISO-8859-1') as a, open(Load_File_Name, 'w', encoding='ISO-8859-1') as b:
   for line in a:
     if ' blah blah ' in line:
       var = line[34:42]
     data = line[0:120] + var + '\n'
     b.write(data)


The input and output locations I have used here are files in Azure Blob Storage. I am following this approach because I need to read each line and perform some operation after reading it.
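For reference, here is a minimal self-contained sketch of the same line-by-line transform using local files. The marker string and slice offsets are placeholders carried over from the question, not real values; the key differences from the original snippet are that `var` is initialized up front and the output file gets an explicit encoding:

```python
def transform_copy(src_path, dst_path, marker=' blah blah '):
    """Copy src to dst line by line, appending a slice captured
    from the most recent marker line (hypothetical offsets)."""
    var = ''  # initialize so lines before the first marker don't raise NameError
    with open(src_path, encoding='ISO-8859-1') as a, \
         open(dst_path, 'w', encoding='ISO-8859-1') as b:
        for line in a:
            line = line.rstrip('\n')
            if marker in line:
                var = line[34:42]
            b.write(line[0:120] + var + '\n')
```

Running this locally on a multi-gigabyte file should complete without stalling, which suggests the loop logic itself is not the bottleneck when the storage backend behaves like an ordinary file system.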

Recommended answer


Your snippet wastes compute/time loading the file into RAM and writing it back to disk. Use the underlying OS to copy the file via Python's pathlib and shutil.


Have a look at this Stack Overflow post. Here's the top answer:

import pathlib
import shutil

my_file = pathlib.Path('/etc/hosts')
to_file = pathlib.Path('/tmp/foo')

shutil.copy(str(my_file), str(to_file))  # Before Python 3.6: shutil.copy needs str paths.
shutil.copy(my_file, to_file)  # Python 3.6+: accepts path-like objects directly.
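When no per-line processing is needed but you still want control over memory use, a buffered stream copy with shutil.copyfileobj is another option; it copies in fixed-size chunks so memory stays constant regardless of file size. The chunk size below is an arbitrary choice for illustration:

```python
import shutil

def stream_copy(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src to dst in fixed-size binary chunks (1 MiB here),
    keeping memory use constant regardless of file size."""
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        shutil.copyfileobj(src, dst, length=chunk_size)
```

Unlike shutil.copy, this works with any readable/writable file objects, not just paths, so it composes with wrappers that stream from remote storage.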

