Reading CSV files from Google Cloud Storage using pandas

Question

I am trying to read a bunch of CSV files from Google Cloud Storage into pandas DataFrames, as explained in "Read csv from Google Cloud storage to pandas dataframe":

storage_client = storage.Client()

bucket = storage_client.bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix)

list_temp_raw = []
for file in blobs:
    filename = file.name
    temp = pd.read_csv('gs://'+bucket_name+'/'+filename+'.csv', encoding='utf-8')
    list_temp_raw.append(temp)

df = pd.concat(list_temp_raw)

Importing gcsfs raises the error shown below. The dask and gcsfs packages are already installed on my machine, but I cannot get rid of the error.

File "C:\Program Files\Anaconda3\lib\site-packages\gcsfs\dask_link.py", line 121, in register
    dask.bytes.core._filesystems['gcs'] = DaskGCSFileSystem
AttributeError: module 'dask.bytes.core' has no attribute '_filesystems'


Recommended answer

There seems to be an error or conflict between the gcsfs and dask packages. In fact, the dask library is not needed for your code to work. The minimal configuration for your code to run is to install the following libraries (the versions listed were the latest at the time of writing):

google-cloud-storage==1.14.0
gcsfs==0.2.1
pandas==0.24.1

Also, the filename already contains the .csv extension, so change the 9th line (the pd.read_csv call) to this:

temp = pd.read_csv('gs://' + bucket_name + '/' + filename, encoding='utf-8')

With this change I ran your code and it works. I suggest creating a virtual environment, installing the libraries there, and running the code inside it.
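Putting the changes above together, the corrected loop might look like the following minimal sketch. The helper names gcs_csv_uris and read_csv_blobs are hypothetical and not part of the original question or answer. The check for a .csv suffix is an added assumption: it skips any non-CSV objects (such as folder placeholder objects) that a prefix listing may return. Running read_csv_blobs requires gcsfs to be installed and valid Google Cloud credentials.

```python
def gcs_csv_uris(bucket_name, blob_names):
    """Build gs:// URIs for the blob names that end in .csv.

    Blob names already include the extension, so nothing is appended.
    """
    return [
        f"gs://{bucket_name}/{name}"
        for name in blob_names
        if name.endswith(".csv")  # assumption: skip non-CSV objects in the listing
    ]


def read_csv_blobs(bucket_name, prefix):
    """Read every CSV object under `prefix` into a single DataFrame."""
    # Imported here so the URI helper above can be used without these packages.
    import pandas as pd
    from google.cloud import storage

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blobs = bucket.list_blobs(prefix=prefix)

    uris = gcs_csv_uris(bucket_name, (blob.name for blob in blobs))
    # pandas delegates gs:// paths to gcsfs; dask is not involved.
    frames = [pd.read_csv(uri, encoding="utf-8") for uri in uris]
    return pd.concat(frames)
```

Note that pd.concat raises a ValueError when given an empty list, so the prefix must match at least one CSV object.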
