Save Pandas data frame to Google Cloud bucket
Question
I want to save a pandas DataFrame directly to Google Cloud Storage. I tried different approaches from write-a-pandas-dataframe-to-google-cloud-storage-or-bigquery, but I am not able to save it.
Note: I can only use the google.cloud package.
Below is the code I tried:
from google.cloud import storage
import pandas as pd
input_dict = [{'Name': 'A', 'Id': 100}, {'Name': 'B', 'Id': 110}, {'Name': 'C', 'Id': 120}]
df = pd.DataFrame(input_dict)
Attempt 1:
destination = f'gs://bucket_name/test.csv'
df.to_csv(destination)
Attempt 2:
storage_client = storage.Client(project='project')
bucket = storage_client.get_bucket('bucket_name')
gs_file = bucket.blob('test.csv')
df.to_csv(gs_file)
I get the following errors:
For attempt 1: No such file or directory: 'gs://bucket_name/test.csv'
For attempt 2: 'Blob' object has no attribute 'close'
Thanks,
Raghunath.
Recommended answer
from google.cloud import storage
import os
import pandas as pd
from io import StringIO  # only needed if you skip the local csv file
# Tell the client library where your Google Cloud private key lives
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/your-google-cloud-private-key.json'
# Create the storage client that the upload calls below refer to as gcs
gcs = storage.Client()
df = pd.DataFrame([{'Name': 'A', 'Id': 100}, {'Name': 'B', 'Id': 110}])
Write it to a csv file on your machine first and upload it:
df.to_csv('local_file.csv')
gcs.get_bucket('BUCKET_NAME').blob('FILE_NAME.csv').upload_from_filename('local_file.csv', content_type='text/csv')
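If you go the local-file route and do not want the csv left behind on disk, one option is to write it into a temporary directory that is removed after the upload. The following is only a sketch: the helper name upload_via_temp_file is made up for illustration, bucket is assumed to be a google.cloud.storage Bucket (for example the result of get_bucket('BUCKET_NAME')), and index=False is my own choice to leave the row index out of the file.

```python
import os
import tempfile

import pandas as pd


def upload_via_temp_file(bucket, blob_name: str, df: pd.DataFrame):
    """Write df to a temporary csv file, upload it, then clean up.

    Hypothetical helper: `bucket` is assumed to be a
    google.cloud.storage Bucket object.
    """
    # The directory and everything in it is deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, "local_file.csv")
        df.to_csv(local_path, index=False)  # assumption: no index column
        blob = bucket.blob(blob_name)
        blob.upload_from_filename(local_path, content_type="text/csv")
    return blob


# Usage (needs real credentials and an existing bucket):
# from google.cloud import storage
# bucket = storage.Client().get_bucket('BUCKET_NAME')
# upload_via_temp_file(bucket, 'FILE_NAME.csv', df)
```

The upload has to happen inside the with block, because the file no longer exists once the temporary directory is cleaned up.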
If you do not want to create a temp csv file, use StringIO:
f = StringIO()
df.to_csv(f)
f.seek(0)
gcs.get_bucket('BUCKET_NAME').blob('FILE_NAME.csv').upload_from_file(f, content_type='text/csv')
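As far as I know, some versions of the client library expect a binary stream in upload_from_file, so passing a text-mode StringIO may raise a type error. A variant that avoids the stream altogether is to let to_csv return the csv text and hand it to upload_from_string. This is a sketch, not the method from the answer above: the helper names are made up for illustration, bucket is assumed to be a google.cloud.storage Bucket, and index=False is my own choice.

```python
import pandas as pd


def dataframe_to_csv_text(df: pd.DataFrame) -> str:
    """Serialize df to csv text in memory (to_csv returns a str when no path is given)."""
    return df.to_csv(index=False)  # assumption: no index column


def upload_dataframe(bucket, blob_name: str, df: pd.DataFrame):
    """Upload df as csv to blob_name and return the blob.

    Hypothetical helper: `bucket` is assumed to be a
    google.cloud.storage Bucket object.
    """
    blob = bucket.blob(blob_name)
    blob.upload_from_string(dataframe_to_csv_text(df), content_type="text/csv")
    return blob


# Usage (needs real credentials and an existing bucket):
# from google.cloud import storage
# bucket = storage.Client().get_bucket('BUCKET_NAME')
# upload_dataframe(bucket, 'FILE_NAME.csv', df)
```

This still uses only the google.cloud package plus pandas, and nothing is written to local disk. The whole csv is held in memory, so for very large frames the local-file approach may be the safer choice.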