How do I read a CSV from a secure FTP server into a Pandas DataFrame?
Question
I have a set of CSV files on a secure FTP server that I'm trying to read into (separate) Pandas DataFrames in memory, so that I can manipulate them and then pass them elsewhere via an API. The FTP server requires authentication, which means I can't use the otherwise very useful pd.read_csv() to read the CSV straight from the server.
The following (Python 3.x) code will connect and then write the file out to disk:
from ftplib import FTP
import pandas as pd

server = "server.ip"
username = "user"
password = "psswd"
file1 = "file1.csv"  # Just one of the files; I'll eventually loop through...

ftp = FTP(server)
ftp.login(user=username, passwd=password)

# Download the remote file to a local file of the same name
with open(file1, "wb") as file:
    ftp.retrbinary("RETR " + file1, file.write)

# Do some other logic not relevant to the question
I'd like to avoid writing the file to disk and then reading it back in. I know that pd.read_csv() will read CSV files straight from public addresses, but I can't find any examples of how to do so when the files are gated behind a login.
Recommended answer
IIRC you can perform authenticated FTP requests using urllib2 (urllib.request in Python 3). Perhaps something like:
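# Python 3 port (urllib2 became urllib.request); reuses server, username, password and file1 from the question
from io import BytesIO
from urllib.parse import quote
from urllib.request import urlopen
import pandas as pd

# For ftp:// URLs urllib takes the credentials from the URL itself;
# an HTTP "Authorization: Basic" header is not used for FTP.
url = "ftp://%s:%s@%s/%s" % (quote(username, safe=""), quote(password, safe=""), server, file1)
response = urlopen(url)
# read_csv expects a path or file-like object, so wrap the downloaded bytes
data = pd.read_csv(BytesIO(response.read()))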
Not tested, but you can find more information on urllib2 here.
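Another option is to stay with ftplib from the question and hand retrbinary an in-memory buffer instead of a file on disk. The following is a minimal sketch, not a tested solution for any particular server: read_ftp_csv and read_many are hypothetical helper names, and the sketch assumes each file is small enough to hold in memory.

```python
from ftplib import FTP
from io import BytesIO

import pandas as pd


def read_ftp_csv(ftp, remote_name, **read_csv_kwargs):
    """Download remote_name over an already logged-in FTP connection into a DataFrame."""
    buffer = BytesIO()
    # retrbinary calls buffer.write once for every chunk it receives
    ftp.retrbinary("RETR " + remote_name, buffer.write)
    buffer.seek(0)  # rewind so pandas reads from the start of the data
    return pd.read_csv(buffer, **read_csv_kwargs)


def read_many(server, username, password, names):
    """Hypothetical helper: log in once and return a dict of {file name: DataFrame}."""
    with FTP(server) as ftp:
        ftp.login(user=username, passwd=password)
        return {name: read_ftp_csv(ftp, name) for name in names}
```

With the variables from the question, read_many(server, username, password, [file1]) would return a dict holding one DataFrame per file. If "secure" here means FTP over TLS, ftplib.FTP_TLS (followed by a call to prot_p() after login) can stand in for FTP; SFTP is a different protocol that ftplib does not speak.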