Write a CSV to store in Google Cloud Storage


Question

Background: I'm taking data in my Python/AppEngine project and creating a .tsv file so that I can create charts with d3.js. Right now I'm writing the file on every page load; I want to instead store it once in Google Cloud Storage and read it from there.

How I'm currently writing the file, each time the page is loaded:

def get(self):  ## this gets called when loading myfile.tsv from d3.js
    datalist = MyEntity.all()
    self.response.headers['Content-Type'] = 'text/csv'
    writer = csv.writer(self.response.out, delimiter='\t')
    writer.writerow(['field1', 'field2'])
    for eachco in datalist:
        writer.writerow([eachco.variable1, eachco.variable2])

And while inefficient, this is working just fine.

Using the Google Cloud Storage documentation, I've been trying to get something like this working:

def get(self):
    filename = '/bucket/myfile.tsv'
    datalist = MyEntity.all()
    bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    writer = csv.writer(self.response.out, delimiter='\t')
    gcs_file = gcs.open(filename, 'w', content_type='text/csv', retry_params=write_retry_params)
    gcs_file.write(writer.writerow(['field1', 'field2']))
    for eachco in datalist:
        gcs_file.write(writer.writerow([eachco.variable1, eachco.variable2]))
    gcs_file.close()

But I get:

TypeError: Expected str but got <type 'NoneType'>.

I thought that the output of csv.writer would be a string, so I'm not sure why I'm getting the TypeError.

So I can think of two possibilities:

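  1. There's something wrong with my code that writes the TSV to Cloud Storage. Iterating over the data and writing a TSV/CSV file to Cloud Storage should be simple, though, right?
  2. I'm going about this in completely the wrong way, and should instead be using BlobStore or a db.TextProperty() to store this .tsv data. (The file isn't that large; certainly well under 1MB.)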

Thanks very much for your help!

Edit - full traceback

Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1530, in __call__
    rv = self.router.dispatch(request, response)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "/mydirectory/myapp/handlers.py", line 21, in dispatch
    webapp2.RequestHandler.dispatch(self)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.1/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/mydirectory/myapp/thisapp.py", line 384, in get
    gcs_file.write(writer.writerow(['field1', 'field2']))
  File "lib/cloudstorage/storage_api.py", line 754, in write
    raise TypeError('Expected str but got %s.' % type(data))
TypeError: Expected str but got <type 'NoneType'>.

Recommended answer

You're still creating the writer on the response:

writer = csv.writer(self.response.out, delimiter='\t')

You need to write to the GCS file instead. Something like this:

import csv
import os

import cloudstorage as gcs
from google.appengine.api import app_identity

def get(self):
    datalist = MyEntity.all()
    bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
    # The GCS client expects paths of the form /bucket_name/object_name
    filename = '/' + bucket_name + '/myfile.tsv'
    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    gcs_file = gcs.open(filename, 'w', content_type='text/csv', retry_params=write_retry_params)
    # Point the csv writer at the GCS file, not at the response
    writer = csv.writer(gcs_file, delimiter='\t')
    writer.writerow(['field1', 'field2'])
    for eachco in datalist:
        writer.writerow([eachco.variable1, eachco.variable2])
    gcs_file.close()
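
On why the original attempt raised the TypeError: csv.writer never returns the row as text. writerow() passes the formatted row to the write() method of the file-like object it wraps, so in the question's Python 2 code its result was None, and that None is what gcs_file.write() rejected. If a string really is wanted, one option is to let the writer fill an in-memory buffer and read the text back. The sketch below assumes Python 3 and io.StringIO (on Python 2 the buffer would be a StringIO.StringIO), and row_as_text is a hypothetical helper name, not part of any library:

```python
import csv
import io


def row_as_text(fields, delimiter='\t'):
    """Format one row with csv.writer and return it as a string."""
    buf = io.StringIO()
    # The writer pushes the formatted row into buf; the text is not
    # what writerow() returns.
    csv.writer(buf, delimiter=delimiter).writerow(fields)
    return buf.getvalue()


header = row_as_text(['field1', 'field2'])
```

The resulting string ends with the csv module's default line terminator (\r\n) and could be handed to a write() call directly. Pointing the writer at the GCS file, as in the answer, avoids the intermediate buffer altogether.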

Notes:

  • not actually tested
  • I also adjusted the filename to use bucket_name
  • if you do this in the get() request, you may want to check whether the file already exists and, if so, use it; otherwise you'd still be generating it on every request. Alternatively, you could move this code into a task or into the .tsv upload handler.
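
The "generate once, then reuse" idea from the last note can be sketched independently of GCS. Everything here is an illustrative assumption: store is a plain dict standing in for the bucket, and get_or_create and generate are hypothetical names. With the App Engine cloudstorage client, the existence check would probably be a gcs.stat(filename) call that raises gcs.NotFoundError on a miss, but verify that against the client library documentation before relying on it.

```python
def get_or_create(store, key, generate):
    """Return store[key], calling generate() only when the key is missing.

    `store` is a dict-like stand-in for the bucket (an assumption made
    for illustration); `generate` builds the file content as a string.
    """
    try:
        return store[key]
    except KeyError:
        # Miss: build the content once and keep it for later requests.
        content = generate()
        store[key] = content
        return content


calls = []


def build_tsv():
    # Record each call so it is visible how often generation runs.
    calls.append(1)
    return 'field1\tfield2\r\n'


bucket = {}
first = get_or_create(bucket, '/bucket/myfile.tsv', build_tsv)
second = get_or_create(bucket, '/bucket/myfile.tsv', build_tsv)
```

Only the first call runs build_tsv; the second is served from the store. The cached copy still has to be regenerated whenever the underlying entities change, which is one reason to do the write in a task or in the upload handler instead.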
