Why is Apache Orc RecordReader.searchArgument() not filtering correctly?


Question

Here is a simple program that:

  1. Writes records into an Orc file
  2. Then tries to read the file using predicate pushdown (searchArgument)

Questions:

  1. Is this the right way to use predicate pushdown in Orc?
  2. The read(..) method seems to return all the records, completely ignoring the searchArguments. Why is that?

Notes:

I have not been able to find any useful unit test that demonstrates how predicate pushdown works in Orc (Orc on GitHub), nor any clear documentation on this feature. I tried looking at the Spark and Presto code, but I was not able to find anything useful.

The code below is

}

Recommended answer

I encountered the same issue, and I think it was rectified by changing

.equals("x", Type.LONG,

to

.equals("x",PredicateLeaf.Type.LONG

With this change, the reader seems to return only the batches that contain the relevant rows, rather than only the rows we asked for.
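
A likely explanation for that behaviour, assuming the reader evaluates the search argument against min/max statistics kept for groups of rows rather than against each row: a group can be skipped only when its statistics prove the value is absent, so every group that survives comes back whole and the caller still has to apply the exact predicate. The sketch below is a toy model of that idea in plain Java, not Orc code; RowGroup, readWithPushdown and filterRows are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowGroupSkipSketch {

    // Hypothetical stand-in for a group of rows plus its min/max statistics.
    static final class RowGroup {
        final long[] values;
        final long min;
        final long max;

        RowGroup(long[] values) {
            this.values = values;
            long lo = Long.MAX_VALUE;
            long hi = Long.MIN_VALUE;
            for (long v : values) {
                lo = Math.min(lo, v);
                hi = Math.max(hi, v);
            }
            this.min = lo;
            this.max = hi;
        }

        // Statistics can rule a value out, but cannot say which rows match.
        boolean mightContain(long target) {
            return min <= target && target <= max;
        }
    }

    // Pushdown step: skip whole groups whose [min, max] range excludes the target.
    static List<long[]> readWithPushdown(List<RowGroup> groups, long target) {
        List<long[]> batches = new ArrayList<>();
        for (RowGroup group : groups) {
            if (group.mightContain(target)) {
                batches.add(group.values);
            }
        }
        return batches;
    }

    // Caller-side step: apply the exact predicate to every row that was returned.
    static List<Long> filterRows(List<long[]> batches, long target) {
        List<Long> matches = new ArrayList<>();
        for (long[] batch : batches) {
            for (long value : batch) {
                if (value == target) {
                    matches.add(value);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<RowGroup> groups = Arrays.asList(
                new RowGroup(new long[] {0, 1, 2, 3}),
                new RowGroup(new long[] {4, 5, 6, 7}),
                new RowGroup(new long[] {8, 9, 10, 11}));

        List<long[]> batches = readWithPushdown(groups, 5L);
        System.out.println("batches returned: " + batches.size());
        System.out.println("rows in returned batch: " + batches.get(0).length);
        System.out.println("rows after exact filter: " + filterRows(batches, 5L));
    }
}
```

In this model, asking for x = 5 skips two of the three groups but still hands back all four rows of the middle group; only the second pass narrows the result to the single matching row. The same pattern would apply to a real reader: treat the search argument as a way to skip data, and re-check the predicate on each returned batch.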
