ClassCastException when reading nested list of records
Problem description
I am reading a BigQuery table from Dataflow in which one of the fields is a "record" and "repeated" field, so I expected the resulting data type in Java to be List<TableRow>.
However, when I try to iterate over the list, I get the following exception:
java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to com.google.api.services.bigquery.model.TableRow
The table schema looks something like this:
{
"id": "my_id",
"values": [
{
"nested_record": "nested"
}
]
}
The code to iterate over values looks something like this:
String id = (String) row.get("id");
List<TableRow> values = (List<TableRow>) row.get("values");
for (TableRow nested : values) {
// more logic
}
The exception is thrown right where the loop begins. The obvious fix is to cast values to a List of LinkedHashMaps instead, but that doesn't feel right.
Why does Dataflow throw this kind of error for nested "records"?
Recommended answer
Take a look at BEAM-2767.
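The root cause is that the DirectRunner performs an encoding round-trip between steps, which is generally not performed in Dataflow. Accessing the repeated record (or any record) as a Map field will succeed on both runners, because TableRow implements the Map interface. Records are read as the "TableRow" type, but when they are encoded, they are encoded as a simple JSON map. Because the JSON coder does not know the types of the map's fields, it deserializes each record as a plain Map type, which is why a LinkedHashMap shows up where a TableRow was expected.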
TableRow is a Map, so you can treat both cases as a Map:
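String id = (String) row.get("id");
// row.get() returns Object, so an explicit (unchecked) cast is needed here.
@SuppressWarnings("unchecked")
List<? extends Map<String, Object>> values = (List<? extends Map<String, Object>>) row.get("values");
for (Map<String, Object> nested : values) {
// more logic
}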
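The effect of the encoding round-trip can be imitated without Beam or BigQuery on the classpath. The sketch below is only an illustration under stated assumptions: FakeTableRow is a hypothetical stand-in for TableRow (a Map subclass), and roundTrip is a hand-written copy that mimics what a JSON encode/decode step would plausibly do to nested records. It is not the actual coder used by either runner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RoundTripDemo {
    // Hypothetical stand-in for TableRow: a Map subclass with no extra state.
    static class FakeTableRow extends HashMap<String, Object> {}

    // Imitates an encode/decode round trip: every map comes back as a plain
    // LinkedHashMap because the subclass type is not part of the encoding.
    static Map<String, Object> roundTrip(Map<String, Object> row) {
        Map<String, Object> copy = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            copy.put(e.getKey(), copyValue(e.getValue()));
        }
        return copy;
    }

    @SuppressWarnings("unchecked")
    static Object copyValue(Object value) {
        if (value instanceof Map) {
            return roundTrip((Map<String, Object>) value);
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<Object>) value) {
                out.add(copyValue(item));
            }
            return out;
        }
        return value;
    }

    // Builds a row shaped like the example in the question.
    static Map<String, Object> sampleRow() {
        FakeTableRow nested = new FakeTableRow();
        nested.put("nested_record", "nested");
        List<Object> values = new ArrayList<>();
        values.add(nested);
        FakeTableRow row = new FakeTableRow();
        row.put("id", "my_id");
        row.put("values", values);
        return row;
    }

    // Returns true if iterating the nested list as FakeTableRow throws.
    @SuppressWarnings("unchecked")
    static boolean castToSubclassFails(Map<String, Object> row) {
        try {
            List<FakeTableRow> values = (List<FakeTableRow>) row.get("values");
            for (FakeTableRow nested : values) {
                nested.size();
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Reads the nested field through the Map interface, which works either way.
    @SuppressWarnings("unchecked")
    static String readNested(Map<String, Object> row) {
        List<Map<String, Object>> values = (List<Map<String, Object>>) row.get("values");
        return (String) values.get(0).get("nested_record");
    }

    public static void main(String[] args) {
        Map<String, Object> original = sampleRow();
        Map<String, Object> decoded = roundTrip(original);
        System.out.println(castToSubclassFails(original)); // false
        System.out.println(castToSubclassFails(decoded));  // true
        System.out.println(readNested(decoded));           // nested
    }
}
```

The cast to the subclass only fails after the round trip, and it fails at the start of the for loop rather than at the list cast, because generic casts are unchecked until an element is actually retrieved. Reading through Map works in both cases, which matches the advice above.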