How do I add headers to the output CSV in Apache Beam Dataflow?
Question
I noticed that the Java SDK has a function that lets you write a header for a CSV file: https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/io/TextIO.Write.html#withHeader-java.lang.String-
Is this feature mirrored in the Python SDK?
Answer
You can now write to a text file and specify a header using the text sink.
From the documentation:
class apache_beam.io.textio.WriteToText(file_path_prefix, file_name_suffix='', append_trailing_newlines=True, num_shards=0, shard_name_template=None, coder=ToStringCoder, compression_type='auto', header=None)
So you can use the following piece of code:
beam.io.WriteToText(bucket_name, file_name_suffix='.csv', header='colname1, colname2')
The complete documentation is available here if you want more details or want to check how it is implemented: https://beam.apache.org/documentation/sdks/pydoc/2.0.0/_modules/apache_beam/io/textio.html#WriteToText
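To put the one-line call above in context, here is a minimal sketch of a small pipeline that formats rows as CSV lines and writes them with a header. The helper `to_csv_line`, the function `run`, the `HEADER` list, and the sample rows are assumptions made for illustration and are not part of the original answer; only `WriteToText` and its `file_name_suffix` and `header` parameters come from the documentation quoted above.

```python
import csv
import io

# Hypothetical column names, matching the ones used in the answer above.
HEADER = ['colname1', 'colname2']


def to_csv_line(row):
    """Format one row (a sequence of values) as a CSV line without a trailing newline."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    # csv.writer appends a line terminator; WriteToText adds its own newline.
    return buf.getvalue().rstrip('\r\n')


def run(output_prefix):
    """Write a few sample rows to CSV files that start with a header line."""
    # Imported here so that to_csv_line stays usable without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as p:
        (p
         | 'CreateRows' >> beam.Create([('a', 1), ('b, with comma', 2)])  # sample data
         | 'ToCsvLine' >> beam.Map(to_csv_line)
         | 'WriteCsv' >> beam.io.WriteToText(
             output_prefix,
             file_name_suffix='.csv',
             header=to_csv_line(HEADER)))
```

Calling `run('output/result')` (a hypothetical local path prefix) should produce one or more shard files ending in `.csv`. The header is written at the start of each shard, so if a single file with a single header line is needed, passing `num_shards=1` from the signature above is one option, at the cost of write parallelism.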