Python - Transfer a file from HTTP(S) URL to FTP/Dropbox without disk writing (chunked upload)


Problem description

I have a large file (500 MB to 1 GB) stored at an HTTP(S) location (say https://example.com/largefile.zip).

I have read/write access to an FTP server.

I have normal user permissions (no sudo).

Within these constraints, I want to read the file from the HTTP URL via requests and send it to the FTP server without writing to disk first.

Normally, I would do this:

import requests

response = requests.get('https://example.com/largefile.zip', stream=True)
with open("largefile_local.zip", "wb") as handle:
    # Stream the response body to a local file in 4 KiB chunks
    for data in response.iter_content(chunk_size=4096):
        handle.write(data)

and then upload the local file to FTP. But I want to avoid the disk I/O. I cannot mount the FTP server as a FUSE filesystem because I don't have superuser rights.

Ideally I would do something like ftp_file.write() instead of handle.write(). Is that possible? The ftplib documentation seems to assume that only local files will be uploaded, not response.content. So ideally I would like to do:

response = requests.get('https://example.com/largefile.zip', stream=True)
for data in response.iter_content(chunk_size=4096):
    ftp_send_chunk(data)

I am not sure how to write ftp_send_chunk().

There is a similar question here (Python - Upload a in-memory file (generated by API calls) in FTP by chunks). My use case requires retrieving a chunk from the HTTP URL and writing it to FTP.

P.S.: The solution provided in the answer (a wrapper around urllib.urlopen) works with Dropbox uploads as well. I had problems with my FTP provider, so I finally used Dropbox, which is working reliably.

Note that Dropbox has an "add web upload" feature in its API that does the same thing (remote upload), but it only works with "direct" links. In my use case the http_url came from a streaming service that was IP-restricted, so this workaround became necessary. Here's the code:

import re
import dropbox

d = dropbox.Dropbox("<ACTION-TOKEN>")  # placeholder: your Dropbox access token
f = FileWithProgress(filehandle)       # wrapper class from the answer below
filesize = filehandle.length           # total size, read before any data is consumed
targetfile = '/' + fname
CHUNK_SIZE = 4 * 1024 * 1024

# Start the upload session with the first chunk
upload_session_start_result = d.files_upload_session_start(f.read(CHUNK_SIZE))
num_chunks = 1
cursor = dropbox.files.UploadSessionCursor(session_id=upload_session_start_result.session_id,
                                           offset=CHUNK_SIZE * num_chunks)
commit = dropbox.files.CommitInfo(path=targetfile)

while CHUNK_SIZE * num_chunks < filesize:
    if (filesize - (CHUNK_SIZE * num_chunks)) <= CHUNK_SIZE:
        # Last chunk: finish the session and commit the file
        print(d.files_upload_session_finish(f.read(CHUNK_SIZE), cursor, commit))
    else:
        d.files_upload_session_append(f.read(CHUNK_SIZE), cursor.session_id, cursor.offset)
    num_chunks += 1
    # Advance the cursor after every chunk so the next call uses the right offset
    cursor.offset = CHUNK_SIZE * num_chunks

# Turn the shared link into a direct-download URL
link = d.sharing_create_shared_link(targetfile)
url = link.url
dl_url = re.sub(r"\?dl\=0", "?dl=1", url)
dl_url = dl_url.strip()
print('dropbox_url: ', dl_url)
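
The offset bookkeeping in the loop above can be separated from the Dropbox calls. The following is only a sketch and not part of the Dropbox SDK: iter_chunks is a hypothetical helper that reads a file-like object in fixed-size pieces and yields each piece with its starting offset and a flag marking the last one. It assumes the total size is known in advance, for example from the Content-Length header.

```python
import io


def iter_chunks(fileobj, total_size, chunk_size=4 * 1024 * 1024):
    """Yield (offset, data, is_last) tuples for a file-like object.

    Hypothetical helper for illustration; total_size is assumed to be
    known before reading starts.
    """
    offset = 0
    while True:
        data = fileobj.read(chunk_size)
        # The chunk is the last one if it reaches total_size or nothing was read
        is_last = offset + len(data) >= total_size or not data
        yield offset, data, is_last
        if is_last:
            break
        offset += len(data)


# Usage with an in-memory stand-in for the HTTP response
payload = b"x" * 10
chunks = list(iter_chunks(io.BytesIO(payload), len(payload), chunk_size=4))
```

A file that fits in a single chunk yields one tuple with is_last set, a case the while loop above never enters.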

I think it should even be possible to do this with Google Drive via their Python API, but using credentials with their Python wrapper is too hard for me. Check this1 and this2.

Recommended answer

Use urllib.request.urlopen, as it returns a file-like object that you can pass directly to FTP.storbinary:

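from ftplib import FTP
import urllib.request

ftp = FTP(host, user, passwd)

# urlopen returns a file-like object; storbinary reads from it block by block
filehandle = urllib.request.urlopen(http_url)

ftp.storbinary("STOR /ftp/path/file.dat", filehandle)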
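
Since the question uses requests, the same idea should carry over: with stream=True, response.raw is a file-like object with a read() method. The sketch below is an illustration under assumptions, not a tested recipe for any particular server: stream_to_ftp and RecordingFTP are hypothetical names, and RecordingFTP only imitates the way storbinary reads its input so the function can be tried offline.

```python
import io


def stream_to_ftp(source, ftp, remote_path, blocksize=8192):
    """Upload a readable binary file-like object via FTP STOR.

    storbinary() repeatedly calls source.read(blocksize), so only one
    block is held in memory at a time and nothing is written to disk.
    """
    return ftp.storbinary("STOR " + remote_path, source, blocksize)


class RecordingFTP:
    """Offline stand-in for ftplib.FTP (an assumption for this demo only)."""

    def __init__(self):
        self.command = None
        self.received = b""

    def storbinary(self, cmd, fp, blocksize=8192):
        self.command = cmd
        while True:
            block = fp.read(blocksize)
            if not block:
                break
            self.received += block
        return "226 Transfer complete"


fake = RecordingFTP()
reply = stream_to_ftp(io.BytesIO(b"hello world"), fake,
                      "/ftp/path/file.dat", blocksize=4)

# Real usage (needs network access; host, user, passwd, http_url as above):
#   import requests
#   from ftplib import FTP
#   with requests.get(http_url, stream=True) as response:
#       response.raise_for_status()
#       with FTP(host, user, passwd) as ftp:
#           stream_to_ftp(response.raw, ftp, "/ftp/path/file.dat")
```

Note that response.raw hands over the bytes as sent by the server, without undoing any Content-Encoding, which is usually what you want for a file such as a zip archive.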






If you want to monitor progress, implement a wrapper file-like object that delegates calls to the filehandle object but also displays the progress:

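class FileWithProgress:
    """File-like wrapper that reports how many bytes have been read."""

    def __init__(self, filehandle):
        self.filehandle = filehandle
        self.p = 0  # bytes read so far

    def read(self, blocksize):
        r = self.filehandle.read(blocksize)
        self.p += len(r)
        # filehandle.length is the number of bytes still to be read,
        # so self.p + length is the total size
        print(str(self.p) + " of " + str(self.p + self.filehandle.length))
        return r

filehandle = urllib.request.urlopen(http_url)

ftp.storbinary("STOR /ftp/path/file.dat", FileWithProgress(filehandle))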
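
As an alternative to wrapping the file object, storbinary also accepts a callback argument, which ftplib calls with each block of data after it has been sent. A minimal sketch, where Progress is a hypothetical name and the loop only simulates the blocks that storbinary would pass:

```python
class Progress:
    """Callable that counts bytes as each sent block is handed over."""

    def __init__(self, total=None):
        self.total = total  # expected size in bytes, or None if unknown
        self.sent = 0

    def __call__(self, block):
        self.sent += len(block)
        if self.total:
            print("{} of {} bytes".format(self.sent, self.total))
        else:
            print("{} bytes".format(self.sent))


progress = Progress(total=10)
for block in (b"abcd", b"efgh", b"ij"):  # simulated blocks, no FTP connection
    progress(block)

# Real usage (assumes ftp and filehandle from the snippets above):
#   ftp.storbinary("STOR /ftp/path/file.dat", filehandle,
#                  callback=Progress(filehandle.length))
```

This keeps the progress reporting out of the read path, at the cost of reporting bytes sent rather than bytes read.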






For Python 2 use:

  • urllib.urlopen instead of urllib.request.urlopen.
  • filehandle.info().getheader('Content-Length') instead of str(self.p + filehandle.length).
