Read CSV/Excel files from SFTP, make some changes in those files using Pandas, and save back
Problem description
I want to read some CSV/Excel files from a secure SFTP folder, apply some changes to them (the same fixed change in each file, such as removing column 2), upload them to a Postgres DB, and also upload them to a different SFTP path, all in Python.
What is the best way to do this?
I have made a connection to the SFTP server using the pysftp library and am reading the files:
import pysftp
import pandas as pd

myHostname = "*****"
myUsername = "****"
myPassword = "***8"

cnopts = pysftp.CnOpts()
cnopts.hostkeys = None
sftp = pysftp.Connection(host=myHostname, username=myUsername,
                         password=myPassword, cnopts=cnopts)
print("Connection successfully established ... ")
sftp.chdir('test/test')
#sftp.pwd
a = []
for i in sftp.listdir_attr():
    with sftp.open(i.filename) as f:
        df = pd.read_csv(f)
How should I proceed with the upload to the DB and with making those changes to the CSV permanent?
Recommended answer
You have the download part done.
For the upload part, see How to Transfer Pandas DataFrame to .csv on SFTP using Paramiko Library in Python? While that question is about Paramiko, the pysftp Connection.open method behaves identically to the Paramiko SFTPClient.open method, so the code is the same.
The full code can look like this:
with sftp.open("/remote/path/data.csv", "r+", bufsize=32768) as f:
    # Download CSV contents from SFTP to memory
    df = pd.read_csv(f)

    # Modify as you need (just an example)
    df.at[0, 'Name'] = 'changed'

    # Upload the in-memory data back to SFTP
    f.seek(0)
    df.to_csv(f, index=False)
    # Truncate the remote file in case the new version of the contents is smaller
    f.truncate(f.tell())
The above updates the same file. If you want to upload to a different file, use this:
# Download CSV contents from SFTP to memory
with sftp.open("/remote/path/source.csv", "r") as f:
    df = pd.read_csv(f)

# Modify as you need (just an example)
df.at[0, 'Name'] = 'changed'

# Upload the in-memory data back to SFTP
with sftp.open("/remote/path/target.csv", "w", bufsize=32768) as f:
    df.to_csv(f, index=False)
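The question mentions applying the same fixed change (removing column 2) to every file in a folder. The following is a minimal sketch of how that could be combined with the read/write pattern above. The helper names (drop_column_by_position, transform_csv, process_directory) and the directory arguments are hypothetical, introduced only for illustration, and the sketch assumes that "column 2" means the second column, i.e. 0-based position 1. The sftp argument is expected to be an open pysftp Connection. The demonstration at the end uses in-memory buffers in place of remote files so it can run without a server.

```python
import io
import posixpath

import pandas as pd


def drop_column_by_position(df, position):
    """Return a copy of df without the column at the given 0-based position."""
    return df.drop(columns=df.columns[position])


def transform_csv(source_file, target_file, position=1):
    """Read CSV from one file-like object, drop one column, write CSV to another."""
    df = pd.read_csv(source_file)
    df = drop_column_by_position(df, position)
    df.to_csv(target_file, index=False)
    return df


def process_directory(sftp, source_dir, target_dir):
    """Apply transform_csv to every .csv file in source_dir (hypothetical helper).

    Returns the list of remote paths that were written.
    """
    written = []
    for name in sftp.listdir(source_dir):
        if not name.lower().endswith(".csv"):
            continue  # skip anything that is not a CSV file
        source_path = posixpath.join(source_dir, name)
        target_path = posixpath.join(target_dir, name)
        with sftp.open(source_path, "r") as fin:
            with sftp.open(target_path, "w", bufsize=32768) as fout:
                transform_csv(fin, fout)
        written.append(target_path)
    return written


# Local demonstration: in-memory buffers stand in for the remote files
source = io.StringIO("Name,Secret,Age\nAnn,x,30\nBob,y,41\n")
target = io.StringIO()
result = transform_csv(source, target)
print(list(result.columns))  # ['Name', 'Age']
```

With a real connection, a call such as process_directory(sftp, "test/test", "test/processed") would read each CSV from the first folder and write the modified copy to the second; both folder names here are placeholders.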
For the purpose of bufsize, see:
Writing to a file on SFTP server opened using pysftp "open" method is slow
Obligatory warning: Do not set cnopts.hostkeys = None, unless you do not care about security. For the correct solution see Verify host key with pysftp.
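The question also asks about loading the data into a Postgres database, which the answer above does not cover. One possible approach is pandas' DataFrame.to_sql. The sketch below is an illustration rather than a definitive implementation: the helper name upload_dataframe, the table name and the sample data are hypothetical. To stay runnable without a database server, it writes to an in-memory SQLite database. For Postgres you would pass a SQLAlchemy engine instead, which assumes SQLAlchemy and a driver such as psycopg2 are installed (the connection URL in the comment is a placeholder).

```python
import sqlite3

import pandas as pd


def upload_dataframe(df, table_name, connection, if_exists="append"):
    """Write df to a SQL table.

    connection may be a SQLAlchemy engine/connection or a sqlite3.Connection.
    """
    df.to_sql(table_name, connection, if_exists=if_exists, index=False)


# For Postgres (assumption: SQLAlchemy and psycopg2 are installed;
# the URL below is a placeholder, not a real server):
#   from sqlalchemy import create_engine
#   engine = create_engine("postgresql+psycopg2://user:password@dbhost:5432/dbname")
#   upload_dataframe(df, "people", engine)

# Runnable stand-in using an in-memory SQLite database
connection = sqlite3.connect(":memory:")
df = pd.DataFrame({"Name": ["Ann", "Bob"], "Age": [30, 41]})
upload_dataframe(df, "people", connection)

rows = connection.execute("SELECT Name, Age FROM people ORDER BY Age").fetchall()
print(rows)  # [('Ann', 30), ('Bob', 41)]
```

With if_exists="append", repeated uploads add rows to the existing table; "replace" would drop and recreate the table on each upload.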