"IOError: size mismatch in get!" when retrieving files via SFTP

Problem description

I have a script which I use to retrieve specific files via SFTP on a regular basis. On occasion, the script will error out with the following output:

Traceback (most recent call last):
  File "ETL.py", line 304, in <module>
    get_all_files(startdate, enddate, "vma" + foldernumber + "/logs/", txtype[1] + single_date2 + ".log", txtype[2] + foldernumber + "\\", sftp)
  File "ETL.py", line 283, in get_all_files
    sftp.get(sftp_dir + filename, local_dir + filename)
  File "C:\Python27\lib\site-packages\pysftp\__init__.py", line 249, in get
    self._sftp.get(remotepath, localpath, callback=callback)
  File "C:\Python27\lib\site-packages\paramiko\sftp_client.py", line 806, in get
    "size mismatch in get!  {} != {}".format(s.st_size, size)
IOError: size mismatch in get!  950272 != 1018742

I have looked through the Paramiko documentation and do not see an explanation for what would trigger this error. Furthermore, the code often works successfully on subsequent tries, or will run successfully for the first few files in the date range and then error out in the middle of downloading all the files I need to retrieve. Other answers on SO say it might be related to the space available on the drive, but I have tried clearing out the destination folder and it hasn't helped. I am trying to download to a network drive/cloud storage if that makes any difference.

Here is the function and code I am using to retrieve the files (via Paramiko):

import pysftp

def get_all_files(start_date, end_date, sftp_dir, filename, local_dir, sftp):
    # Download one remote file into local_dir using the connection passed in
    # (the parameter is named sftp so the body no longer relies on a global).
    sftp.get(sftp_dir + filename, local_dir + filename)

with pysftp.Connection('******.com', username='*****', password='******', cnopts=cnopts) as sftp:
    get_all_files(startdate, enddate, "vma" + foldernumber + "/logs/", txtype[1] + single_date2 + ".log", txtype[2] + foldernumber + "\\", sftp)

I would like all downloadable files to be retrieved without producing this error.

Recommended answer

The error message "IOError: size mismatch in get!  950272 != 1018742" is raised by the get function of the Paramiko library when the size of the copied file in the local directory (as reported by os.stat) does not match the size returned by getfo for the remote file:

with open(localpath, "wb") as fl:
    size = self.getfo(remotepath, fl, callback)
s = os.stat(localpath)
if s.st_size != size:
    raise IOError(
        "size mismatch in get!  {} != {}".format(s.st_size, size)
    )

Why does this happen if there is nothing wrong with the connection or the transfer?

While reading the Paramiko code and trying to debug this issue, a strange behaviour of my local file system caught my attention. For every file copied from the remote file system, the local file system took some time before it registered the correct file size.

This behaviour leads me to the following assumption: the get function of the Paramiko library does process the file correctly, but it does not wait for the local file system to catch up. It calls s = os.stat(localpath) right after getfo has finished writing, and so may read the status of the local file (including its size) before that status is up to date.

This could lead to a mismatch between the reported local file size and the correctly determined remote file size, which would trigger the IOError "size mismatch in get!  {} != {}".format(s.st_size, size).

It would also explain why the error cannot be reproduced consistently: how quickly the local operating system (or, as in the question, a network drive) reports the final file size varies from run to run.
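If this assumption holds, the size reported by os.stat should eventually catch up with the number of bytes written. A minimal sketch for checking that on an affected machine follows. The helper name wait_for_size and the timeout values are hypothetical and are not part of Paramiko or pysftp:

```python
import os
import time


def wait_for_size(path, expected_size, timeout=5.0, interval=0.1):
    """Poll os.stat(path) until it reports expected_size or the timeout expires.

    Hypothetical diagnostic helper; timeout and interval are arbitrary choices.
    Returns True if the size matched in time, False otherwise.
    """
    deadline = time.time() + timeout
    while True:
        # Ask the file system what size it currently reports for the file.
        if os.stat(path).st_size == expected_size:
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)
```

If this returns True shortly after a download that raised the error, the data arrived intact and only the size check ran too early. If it returns False, the file really is incomplete and the cause lies elsewhere.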

How did I solve this for myself?

I modified the Paramiko code of the get function, which can be found on line 785 of "sftp_client.py" in my installation. I added localsize = fl.tell() inside the file handling and updated the size check accordingly:

with open(localpath, "wb") as fl:
    size = self.getfo(remotepath, fl, callback)
    localsize = fl.tell()
if localsize != size:
    raise IOError(
        "size mismatch  {} != {}".format(localsize, size)
    )

This should avoid the apparently flawed local file-size check based on s = os.stat(localpath), replacing it with one that reads the size from the file object while the file is still open.
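The same check can also be done without editing the installed library, by calling getfo yourself. The following is only a sketch: download_checked is a hypothetical helper, and it assumes the connection object exposes getfo(remotepath, fileobj) returning the number of bytes transferred, as Paramiko's SFTPClient.getfo does (pysftp's Connection.getfo is assumed to behave the same way).

```python
def download_checked(sftp, remotepath, localpath):
    """Download remotepath to localpath and verify the size via the file object.

    Hypothetical helper. sftp is assumed to be any object with a
    getfo(remotepath, fileobj) method that returns the bytes transferred.
    """
    with open(localpath, "wb") as fl:
        size = sftp.getfo(remotepath, fl)
        # Read the position from the open file instead of calling os.stat.
        localsize = fl.tell()
    if localsize != size:
        raise IOError("size mismatch  {} != {}".format(localsize, size))
    return size
```

In the script from the question, the call sftp.get(sftp_dir + filename, local_dir + filename) would then become download_checked(sftp, sftp_dir + filename, local_dir + filename), and the change survives a Paramiko upgrade because nothing in site-packages is touched.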
