How do I add headers to the output CSV for Apache Beam Dataflow?


Question

I noticed that the Java SDK has a function that lets you write a header for the CSV file: https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/io/TextIO.Write.html#withHeader-java.lang.String-

Is this feature mirrored in the Python SDK?

Solution

You can now write to a text file and specify a header using the text sink.

From the documentation:

class apache_beam.io.textio.WriteToText(file_path_prefix, file_name_suffix='', append_trailing_newlines=True, num_shards=0, shard_name_template=None, coder=ToStringCoder, compression_type='auto', header=None)

So you can use the following piece of code:

beam.io.WriteToText(bucket_name, file_name_suffix='.csv', header='colname1, colname2')

The complete documentation is available here if you want more details or want to check how it is implemented: https://beam.apache.org/documentation/sdks/pydoc/2.0.0/_modules/apache_beam/io/textio.html#WriteToText
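
To show where the WriteToText call fits, here is a minimal sketch of a pipeline that writes a CSV with a header. The column names, the sample rows, the `to_csv_line` helper and the `run` function are all illustrative assumptions, not part of the original answer. WriteToText writes one line per element, so each row is formatted as a CSV string first.

```python
import csv
import io

# Hypothetical column names; replace with the real ones.
COLUMNS = ["colname1", "colname2"]
HEADER = ",".join(COLUMNS)


def to_csv_line(row):
    """Format one dict as a single CSV line, without a trailing newline."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([row[name] for name in COLUMNS])
    return buffer.getvalue().rstrip("\r\n")


def run(output_prefix):
    """Write a few sample rows to <output_prefix>-*.csv with a header line."""
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam

    # Illustrative data only.
    rows = [
        {"colname1": "a", "colname2": 1},
        {"colname1": "b", "colname2": 2},
    ]

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "CreateRows" >> beam.Create(rows)
            | "FormatCsv" >> beam.Map(to_csv_line)
            | "WriteCsv" >> beam.io.WriteToText(
                output_prefix,
                file_name_suffix=".csv",
                header=HEADER,
            )
        )
```

Calling `run("gs://my-bucket/output")` (a placeholder path) would start the pipeline. As far as I know, the header is written at the top of every output shard, so if a single file with a single header is needed, passing `num_shards=1` to WriteToText is one option, at the cost of write parallelism.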
