Retrieve data from gz file on FTP server without writing it locally
Question
I would like to retrieve the data inside a gz-compressed file stored on an FTP server, without writing the file to local storage.
So far I have:
from ftplib import FTP
import gzip
ftp = FTP('ftp.server.com')
ftp.login()
ftp.cwd('/a/folder/')
fileName = 'aFile.gz'
localfile = open(fileName,'wb')
ftp.retrbinary('RETR '+fileName, localfile.write, 1024)
f = gzip.open(localfile,'rb')
data = f.read()
This, however, writes the file "localfile" to the current storage.
I tried to change it to this:
from ftplib import FTP
import zlib
ftp = FTP('ftp.server.com')
ftp.login()
ftp.cwd('/a/folder/')
fileName = 'aFile.gz'
data = ftp.retrbinary('RETR '+fileName, zlib.decompress, 1024)
However, ftp.retrbinary does not return the output of its callback.
Is there a way to do this?
Recommended answer
A simple implementation is to:
- download the file to an in-memory file-like object, such as BytesIO;
- pass that to the fileobj parameter of the GzipFile constructor.
import gzip
from io import BytesIO
import shutil
from ftplib import FTP
ftp = FTP('ftp.example.com')
ftp.login('username', 'password')
# Collect the compressed bytes in memory instead of in a local file
flo = BytesIO()
ftp.retrbinary('RETR /remote/path/archive.tar.gz', flo.write)
# Rewind so GzipFile reads from the start of the buffer
flo.seek(0)
# The handle is named gz so it does not shadow the gzip module
with open('archive.tar', 'wb') as fout, gzip.GzipFile(fileobj=flo) as gz:
    shutil.copyfileobj(gz, fout)
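The question asks for the decompressed data itself, not a decompressed file on disk. Here is a minimal sketch of the same in-memory approach that returns the bytes. The helper name read_remote_gz and its blocksize default are illustrative assumptions, not part of the original answer; ftp is assumed to be a connected, logged-in ftplib.FTP instance.

```python
import gzip
from io import BytesIO


def read_remote_gz(ftp, remote_name, blocksize=8192):
    """Return the decompressed bytes of a remote .gz file.

    ftp is assumed to be a connected, logged-in ftplib.FTP object.
    The compressed file is held in memory only; nothing is written locally.
    """
    buf = BytesIO()
    # retrbinary calls buf.write once per received block
    ftp.retrbinary('RETR ' + remote_name, buf.write, blocksize)
    # Rewind so GzipFile starts reading at the gzip header
    buf.seek(0)
    with gzip.GzipFile(fileobj=buf) as gz:
        return gz.read()


# Usage with the names from the question (not executed here):
# ftp = FTP('ftp.server.com'); ftp.login(); ftp.cwd('/a/folder/')
# data = read_remote_gz(ftp, 'aFile.gz', 1024)
```

Both the compressed and the decompressed content live in memory at the same time here, so this is only suitable for files that comfortably fit in RAM.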
The above loads the whole .gz file into memory, which can be inefficient for large files. A smarter implementation would stream the data instead, but that would probably require implementing a custom file-like object.
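One possible way to stream without a custom file-like object is to decompress incrementally inside the retrbinary callback. This is an alternative technique, not what the answer above describes. The sketch below uses zlib.decompressobj with wbits set to 16 + zlib.MAX_WBITS so that zlib accepts the gzip header and trailer. The names make_gunzip_callback, on_chunk, finish and sink are hypothetical, and the sketch assumes a single-member gzip file.

```python
import zlib


def make_gunzip_callback(sink):
    """Build a (callback, finish) pair for incremental gzip decompression.

    sink is any callable that accepts decompressed bytes, for example
    list.append or a function that parses records as they arrive.
    Assumption: the remote file is a single-member gzip stream.
    """
    # 16 + MAX_WBITS tells zlib to expect gzip framing, not raw zlib
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)

    def on_chunk(chunk):
        out = decomp.decompress(chunk)
        if out:
            sink(out)

    def finish():
        # Emit whatever is still buffered once the transfer has ended
        tail = decomp.flush()
        if tail:
            sink(tail)

    return on_chunk, finish


# Usage with the names from the question (not executed here):
# pieces = []
# on_chunk, finish = make_gunzip_callback(pieces.append)
# ftp.retrbinary('RETR ' + fileName, on_chunk, 1024)
# finish()
# data = b''.join(pieces)
```

Collecting the pieces in a list, as in the usage comment, still keeps the full decompressed data in memory. The saving comes when sink processes each piece and discards it.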
See also: Get file names inside a zip file on an FTP server without downloading the whole archive.