Read CSV/Excel files from an SFTP server, make some changes in those files using Pandas, and save them back

Question

I want to read some CSV/Excel files from a secure SFTP folder, make some changes to them (the same fixed change in each file, such as removing column 2), upload them to a PostgreSQL database, and also upload them to a different SFTP path, all in Python.

What is the best way to do this?

I have made a connection to the SFTP server using the pysftp library and am reading the files:

import pysftp
import pandas as pd

myHostname = "*****"
myUsername = "****"
myPassword = "***8"
cnopts = pysftp.CnOpts()
cnopts.hostkeys = None  # disables host key verification (insecure)

sftp = pysftp.Connection(host=myHostname, username=myUsername,
                         password=myPassword, cnopts=cnopts)
print("Connection successfully established ... ")
sftp.chdir('test/test')
#sftp.pwd
a = []
for i in sftp.listdir_attr():
    # Read each remote file in the current directory into a DataFrame
    with sftp.open(i.filename) as f:
        df = pd.read_csv(f)

How should I proceed with the upload to the DB and with making those changes to the CSV permanent?

Recommended answer

You have the download part done.

For the upload part, see How to Transfer Pandas DataFrame to .csv on SFTP using Paramiko Library in Python? While that question is about Paramiko, the pysftp Connection.open method behaves identically to the Paramiko SFTPClient.open method, so the code is the same.

The full code can look like this:

with sftp.open("/remote/path/data.csv", "r+", bufsize=32768) as f:
    # Download CSV contents from SFTP to memory
    df = pd.read_csv(f)

    # Modify as you need (just an example)
    df.at[0, 'Name'] = 'changed'

    # Upload the in-memory data back to SFTP
    f.seek(0)
    df.to_csv(f, index=False)
    # Truncate the remote file in case the new version of the contents is smaller
    f.truncate(f.tell())
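
Why the truncate step matters can be illustrated without any SFTP server. The sketch below is only an analogy using an ordinary local temporary file: the helper names `overwrite_in_place` and `demo` and the sample CSV contents are made up for illustration. It overwrites a file from offset 0 with shorter contents, once without and once with truncation.

```python
import os
import tempfile

def overwrite_in_place(path, new_bytes, truncate):
    """Overwrite a file from offset 0 and return its final contents as text."""
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(new_bytes)
        if truncate:
            # Cut the file at the current position, dropping stale trailing bytes
            f.truncate(f.tell())
    with open(path, "rb") as f:
        return f.read().decode("utf-8")

def demo():
    """Return (contents without truncate, contents with truncate)."""
    results = []
    for truncate in (False, True):
        # Hypothetical original contents, longer than the replacement
        with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
            tmp.write(b"Name,Age\nAlexander,30\n")
            path = tmp.name
        try:
            results.append(overwrite_in_place(path, b"Name,Age\nAl,30\n", truncate))
        finally:
            os.remove(path)
    return results[0], results[1]

stale, clean = demo()
```

Without truncation, the tail of the old contents survives after the new data and corrupts the CSV; with truncation, only the new contents remain.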

The above updates the same file. If you want to upload to a different file, use this:

# Download CSV contents from SFTP to memory
with sftp.open("/remote/path/source.csv", "r") as f:
    df = pd.read_csv(f)

# Modify as you need (just an example)
df.at[0, 'Name'] = 'changed'

# Upload the in-memory data back to SFTP
with sftp.open("/remote/path/target.csv", "w", bufsize=32768) as f:
    df.to_csv(f, index=False)
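
The question mentions a fixed change such as removing column 2 from every file. A minimal sketch of that step, assuming "column 2" means the second column by position, could be factored into helpers that work on any file-like objects. The names `drop_second_column` and `rewrite_csv` and the sample data are hypothetical; `io.StringIO` buffers stand in for the remote files so the sketch runs without a server.

```python
import io
import pandas as pd

def drop_second_column(df):
    """Return a copy of df without its second column (position 1)."""
    return df.drop(columns=df.columns[1])

def rewrite_csv(src, dst):
    """Read CSV from src, apply the fixed change, and write CSV to dst."""
    df = drop_second_column(pd.read_csv(src))
    df.to_csv(dst, index=False)
    return df

# In-memory stand-ins for the remote source and target files
src = io.StringIO("Name,Age,City\nAnn,30,Oslo\nBob,25,Rome\n")
dst = io.StringIO()
result = rewrite_csv(src, dst)

# With a real connection, the same helper could presumably be called as:
#   with sftp.open("/remote/path/source.csv", "r") as src, \
#        sftp.open("/remote/path/target.csv", "w", bufsize=32768) as dst:
#       rewrite_csv(src, dst)
```

Keeping the transformation separate from the transfer makes it possible to test the change locally and then loop over the names returned by sftp.listdir().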


For the purpose of bufsize, see:
Writing to a file on SFTP server opened using pysftp "open" method is slow

Obligatory warning: Do not set cnopts.hostkeys = None, unless you do not care about security. For the correct solution see Verify host key with pysftp.
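
The question also asks about loading the modified data into a database, which the code above does not cover. One common option is pandas' DataFrame.to_sql. The following is only a sketch: the helper name `upload_frame`, the table name `people`, and the sample data are made up, and an in-memory SQLite connection stands in for PostgreSQL so that the example runs anywhere.

```python
import sqlite3
import pandas as pd

def upload_frame(df, con, table):
    """Append df to the given table via DataFrame.to_sql; return the row count of df."""
    df.to_sql(table, con, if_exists="append", index=False)
    return len(df)

# Hypothetical DataFrame, as it might look after the modifications
df = pd.DataFrame({"Name": ["Ann", "Bob"], "City": ["Oslo", "Rome"]})

# In-memory SQLite is a stand-in for PostgreSQL in this sketch
con = sqlite3.connect(":memory:")
rows_written = upload_frame(df, con, "people")
stored = con.execute("SELECT Name, City FROM people ORDER BY Name").fetchall()
con.close()

# For PostgreSQL, `con` would instead be an SQLAlchemy engine, for example
# (placeholder credentials, assuming the psycopg2 driver is installed):
#   from sqlalchemy import create_engine
#   con = create_engine("postgresql+psycopg2://user:password@host:5432/dbname")
```

Using if_exists="append" adds rows to an existing table and creates the table if it is missing; whether appending or replacing is appropriate depends on how the target table is used.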
