How do I add headers to the output CSV in Apache Beam Dataflow?
I noticed that the Java SDK has a function that lets you write the header of a CSV file: https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/io/TextIO.Write.html#withHeader-java.lang.String-
Is this feature mirrored in the Python SDK?
You can now write to a text file and specify a header using the text sink.
From the documentation:
class apache_beam.io.textio.WriteToText(file_path_prefix, file_name_suffix='', append_trailing_newlines=True, num_shards=0, shard_name_template=None, coder=ToStringCoder, compression_type='auto', header=None)
So you can use the following piece of code:
beam.io.WriteToText(bucket_name, file_name_suffix='.csv', header='colname1, colname2')
The complete documentation is available here if you want more details or want to check how it is implemented: https://beam.apache.org/documentation/sdks/pydoc/2.0.0/_modules/apache_beam/io/textio.html#WriteToText
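For context, here is a minimal sketch of how the `header` argument might fit into a small pipeline. The column names, the sample rows, the `to_csv_line` helper and the `run` function are all hypothetical and only serve as illustration; the only Beam call taken from the answer above is `WriteToText` with `file_name_suffix` and `header`. The header is built with the same helper as the data rows so that both are quoted and separated the same way.

```python
import csv
import io

COLUMNS = ["colname1", "colname2"]  # hypothetical column names


def to_csv_line(values):
    """Format one row as a CSV line without a trailing newline.

    WriteToText appends the newline itself (append_trailing_newlines=True).
    """
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow(values)
    return buf.getvalue()


# Header string passed to WriteToText, formatted like the data rows.
HEADER = to_csv_line(COLUMNS)


def run(output_prefix):
    """Write a few made-up rows to <output_prefix>-*.csv with a header line."""
    # Imported here so the helpers above work even without Beam installed.
    import apache_beam as beam

    rows = [("a", 1), ("b, with comma", 2)]  # made-up sample data
    with beam.Pipeline() as p:
        (
            p
            | "CreateRows" >> beam.Create(rows)
            | "FormatCsv" >> beam.Map(to_csv_line)
            | "WriteCsv" >> beam.io.WriteToText(
                output_prefix, file_name_suffix=".csv", header=HEADER
            )
        )
```

Calling `run('gs://my-bucket/output/result')` (a made-up path) would write the files under that prefix. As far as I can tell, the header is written at the start of each output file, so if the output is split into several shards, each shard begins with the header line. Passing `num_shards=1` should give a single file, at the cost of write parallelism.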