Downloading second file from ftp fails
Problem Description
I want to download multiple files from an FTP server in Python. My code works when I download just one file, but fails for more than one!
import urllib
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz', 'file1.tar.gz')
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
The error says:
Traceback (most recent call last):
File "/home/ehsan/dev_center/bigADEVS-bknd/daemons/crawler/ftp_oa_crawler.py", line 3, in <module>
urllib.urlretrieve('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz')
File "/usr/lib/python2.7/urllib.py", line 98, in urlretrieve
return opener.retrieve(url, filename, reporthook, data)
File "/usr/lib/python2.7/urllib.py", line 245, in retrieve
fp = self.open(url, data)
File "/usr/lib/python2.7/urllib.py", line 213, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 558, in open_ftp
(fp, retrlen) = self.ftpcache[key].retrfile(file, type)
File "/usr/lib/python2.7/urllib.py", line 906, in retrfile
conn, retrlen = self.ftp.ntransfercmd(cmd)
File "/usr/lib/python2.7/ftplib.py", line 334, in ntransfercmd
host, port = self.makepasv()
File "/usr/lib/python2.7/ftplib.py", line 312, in makepasv
host, port = parse227(self.sendcmd('PASV'))
File "/usr/lib/python2.7/ftplib.py", line 830, in parse227
raise error_reply, resp
IOError: [Errno ftp error] 200 Type set to I
What should I do?
It is a bug in urllib in Python 2.7. It has been reported here, and the reason behind it is explained here.
Now, when a user tries to download the same file, or another file from the same directory, the key (host, port, dirs) remains the same, so open_ftp() skips the FTP initialization. Because of this, the previous FTP connection is reused, and when new commands are sent to the server, the server first replies with the previous ACK. This causes a domino effect: each response is delayed by one, and we get an exception from parse227().
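To see why both downloads collide, here is a minimal sketch of the cache-key logic (an assumption: this is a simplified model of what open_ftp() does, not the actual code from /usr/lib/python2.7/urllib.py; the helper ftp_cache_key is hypothetical):

```python
# Simplified model of the (host, port, dirs) cache key that urllib's
# open_ftp() uses to decide whether an existing FTP connection is reused.
from urllib.parse import urlparse
import posixpath

def ftp_cache_key(url):
    """Build a (host, port, dirs) tuple for an ftp:// URL."""
    p = urlparse(url)
    dirs = posixpath.dirname(p.path)  # directory part only, not the filename
    return (p.hostname, p.port or 21, dirs)

# The two files from the question live in the same directory, so they
# produce the same key -> the stale connection is reused.
k1 = ftp_cache_key('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz')
k2 = ftp_cache_key('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz')
```

Since the filename is not part of the key, any two files in the same remote directory hit the same cache slot.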
A possible solution is to clear the cache that may have been built up by previous calls. You can call the urllib.urlcleanup() method between your urlretrieve calls, as mentioned here.
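The workaround can be sketched as follows (an assumption: this uses the Python 3 names in urllib.request; on Python 2.7, as in the question, urlretrieve and urlcleanup live directly on the urllib module, and the retrieve_all helper is hypothetical):

```python
# Workaround sketch: clear urllib's connection cache between retrievals
# so each download starts a fresh FTP session.
import urllib.request

def retrieve_all(jobs):
    """Download each (url, filename) pair, calling urlcleanup() after
    every retrieval to drop the cached FTP connection."""
    for url, filename in jobs:
        urllib.request.urlretrieve(url, filename)
        urllib.request.urlcleanup()  # discard cached connections and temp files

# Usage with the URLs from the question:
# retrieve_all([
#     ('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC1790863.tar.gz', 'file1.tar.gz'),
#     ('ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC2329613.tar.gz', 'file2.tar.gz'),
# ])
```

Calling urlcleanup() after each retrieval means the second request rebuilds the connection instead of reusing the stale one, which avoids the out-of-sync PASV replies.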
Hope this helps!