External Table not getting updated from parquet files written by spark streaming


Problem description

I am using Spark Streaming to write the aggregated output as Parquet files to HDFS using SaveMode.Append. I have an external table created like:

CREATE TABLE if not exists rolluptable
USING org.apache.spark.sql.parquet
OPTIONS (
  path "hdfs:////"
);
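
For reference, the write side of such a job looks roughly like the sketch below. This is a minimal illustration only: the socket source, column names, batch interval, and output path are assumptions, since the actual job and path are not shown in the question.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RollupWriter {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("rollup-writer"))
    val ssc = new StreamingContext(sc, Seconds(60))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical source: lines of "key,value" arriving on a socket.
    val events = ssc.socketTextStream("localhost", 9999)
      .map(_.split(","))
      .map(parts => (parts(0), parts(1).toLong))

    // Aggregate each micro-batch and append the result as Parquet files under
    // the directory that the external table points at (path is illustrative).
    events.foreachRDD { rdd =>
      val aggregated = rdd.reduceByKey(_ + _).toDF("key", "total")
      aggregated.write
        .mode(SaveMode.Append)
        .parquet("hdfs:///path/to/rolluptable")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}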

I was under the impression that, for an external table, queries should also pick up the data from newly added Parquet files. However, the newly written files do not seem to be picked up.

Dropping and recreating the table every time works fine, but that is not a real solution.
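
For clarity, that workaround amounts to something like the following sketch (sqlContext here is assumed to be an existing HiveContext, and the HDFS path is hypothetical, since the original path is truncated):

// Workaround (works, but not desired): drop and recreate the external table so
// that Spark re-discovers the Parquet files currently under the path.
sqlContext.sql("DROP TABLE IF EXISTS rolluptable")
sqlContext.sql(
  """CREATE TABLE IF NOT EXISTS rolluptable
    |USING org.apache.spark.sql.parquet
    |OPTIONS (path "hdfs:///path/to/rolluptable")""".stripMargin)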

Please suggest how my table can also include the data from the newer files.

Recommended answer

Are you reading those tables with Spark? If so, Spark caches Parquet table metadata (since schema discovery can be expensive), which is why newly appended files are not picked up automatically.

To overcome this, you have two options:

  1. Set the configuration spark.sql.parquet.cacheMetadata to false
  2. Refresh the table before querying: sqlContext.refreshTable("my_table") (see the sketch below)
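
A minimal sketch of both options, assuming a Spark 1.x HiveContext and the table name from the question (the application name is illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("rollup-query"))
val sqlContext = new HiveContext(sc)

// Option 1: disable Parquet metadata caching so each query re-lists the files.
sqlContext.setConf("spark.sql.parquet.cacheMetadata", "false")

// Option 2: keep caching enabled, but refresh the table's cached metadata
// (including the file listing) right before querying it.
sqlContext.refreshTable("rolluptable")
sqlContext.sql("SELECT * FROM rolluptable").show()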

See here for more details: http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-metastore-parquet-table-conversion
