Convert from PCollection<TableRow> to PCollection<KV<K,V>>

Question

I'm trying to extract data from two BigQuery tables and then join them with CoGroupByKey. The BigQuery read produces a PCollection<TableRow>, but CoGroupByKey requires a PCollection<KV<K,V>>. How can I convert from PCollection<TableRow> to PCollection<KV<K,V>>?

Recommended answer

CoGroupByKey needs to know which key to co-group by: this is the K in KV<K, V>, and V is the value associated with that key in the given collection. Co-grouping several collections gives you, for each key, all of the values with that key in each collection.
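
To make that result shape concrete, here is a minimal plain-Java sketch of the co-grouping semantics for two keyed inputs. It is not Beam code and not how Beam implements CoGroupByKey; the names CoGroupSketch, Grouped, kv, and coGroup are hypothetical, and Map.Entry stands in for KV.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of co-grouping semantics; this is NOT Beam code.
final class CoGroupSketch {

  // Hypothetical result holder: all values seen for one key in each input.
  static final class Grouped<A, B> {
    final List<A> left = new ArrayList<>();
    final List<B> right = new ArrayList<>();
  }

  // Stand-in for Beam's KV.of(key, value).
  static <K, V> Map.Entry<K, V> kv(K key, V value) {
    return new SimpleEntry<>(key, value);
  }

  // For every key, collect the values carrying that key from both inputs.
  static <K extends Comparable<K>, A, B> Map<K, Grouped<A, B>> coGroup(
      List<Map.Entry<K, A>> first, List<Map.Entry<K, B>> second) {
    Map<K, Grouped<A, B>> out = new TreeMap<>();
    for (Map.Entry<K, A> e : first) {
      out.computeIfAbsent(e.getKey(), k -> new Grouped<A, B>()).left.add(e.getValue());
    }
    for (Map.Entry<K, B> e : second) {
      out.computeIfAbsent(e.getKey(), k -> new Grouped<A, B>()).right.add(e.getValue());
    }
    return out;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> emails = Arrays.asList(
        kv("u1", "a@example.com"), kv("u2", "b@example.com"));
    List<Map.Entry<String, Integer>> orders = Arrays.asList(
        kv("u1", 10), kv("u1", 20), kv("u3", 30));
    // Keys present in only one input still appear, with an empty list on the other side.
    coGroup(emails, orders).forEach((key, g) ->
        System.out.println(key + " -> " + g.left + " / " + g.right));
  }
}
```

Running main prints one line per key: u1 gets one email and two orders, u2 gets one email and no orders, and u3 gets no email and one order. Both inputs must share the same key type, which is why the conversion to KV comes first.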

So you need to convert both of your PCollection<TableRow> inputs to PCollection<KV<YourKey, TableRow>>, where YourKey is the type of the key you want to join on; in your case that might be String, Integer, or something else.

The best transform for this conversion is probably WithKeys. For example, the following code converts a PCollection<TableRow> to a PCollection<KV<String, TableRow>> keyed by a hypothetical userId field of type String:

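// Imports assume the Apache Beam Java SDK plus the BigQuery API client model classes.
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.apache.beam.sdk.transforms.WithKeys;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

PCollection<TableRow> rows = ...;
// Key each row by its userId field so it can be fed to CoGroupByKey.
PCollection<KV<String, TableRow>> rowsKeyedByUser = rows
    .apply(WithKeys.of(new SerializableFunction<TableRow, String>() {
      @Override
      public String apply(TableRow row) {
        return (String) row.get("userId");
      }
    }));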
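
The cast in the sample assumes userId is always present and already a String. If that might not hold for your tables, a more defensive key extractor could look like the sketch below. RowKeys and keyOf are hypothetical names, not part of any SDK. The helper takes a plain Map<String, Object> so it runs without the BigQuery client library; TableRow is itself a Map<String, Object>, so a TableRow could be passed in directly.

```java
import java.util.Map;

// Hypothetical helper, not part of Beam or the BigQuery client library.
final class RowKeys {

  // Returns the named field as a String key, failing loudly if it is absent.
  static String keyOf(Map<String, Object> row, String field) {
    Object value = row.get(field);
    if (value == null) {
      // A null key would make the later grouping step hard to debug.
      throw new IllegalArgumentException("Row has no value for key field: " + field);
    }
    // toString() tolerates fields that arrive as numbers rather than Strings.
    return value.toString();
  }
}
```

Inside the apply method above, the return statement would then become something like return RowKeys.keyOf(row, "userId");. Whether to throw on a missing key or filter such rows out beforehand is a design choice that depends on your data.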
