Google Cloud Dataflow (Python): function to read from and write to a .csv file?


Question



I am not able to figure out the precise functions in GCP Dataflow Python SDK that read from and write to csv files (or any non-txt files for that matter). For BigQuery, I have figured out the following functions:

beam.io.Read(beam.io.BigQuerySource('%Table_ID%'))
beam.io.Write(beam.io.BigQuerySink('%Table_ID%'))

For reading and writing text files, I already know about the ReadFromText and WriteToText functions.

However, I cannot find any examples for the GCP Dataflow Python SDK in which data is written to or read from csv files. Could you please provide the GCP Dataflow Python SDK functions for reading from and writing to csv files, in the same manner as the BigQuery functions above?

Answer

There is a CsvFileSource in the beam_utils PyPI package that reads .csv files, handles file headers, and supports custom delimiters. More information on how to use this source is given in this answer. Hope that helps!
