How does the unbounded table work in Spark Structured Streaming


Problem description

Take word count as an example: while the application is up and running, it receives the word "Spark", so the result table contains a row (Spark, 1).

After the application has been running for a day or even a week and it receives "Spark" again, the result table should then contain the row (Spark, 2).
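A minimal sketch of this scenario, following the classic socket word-count pattern from the Spark Structured Streaming guide (the host, port, and console sink are placeholder choices); the running counts below are the unbounded result table in question:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("StructuredWordCount")
  .getOrCreate()

import spark.implicits._

// Lines arriving from a socket source (host and port are placeholders)
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()

// Split each line into words and keep a running count per word.
// These running counts are the "result table": it is unbounded and
// grows as new words arrive, so (Spark, 1) becomes (Spark, 2) when
// the word is seen again, even a week later.
val wordCounts = lines.as[String]
  .flatMap(_.split(" "))
  .groupBy("value")
  .count()

// "complete" mode re-emits the entire result table on every trigger
val query = wordCounts.writeStream
  .outputMode("complete")
  .format("console")
  .start()

query.awaitTermination()
```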

I am just using the above scenario to raise the question: how does the unbounded table keep the state of the data it receives, given that this state could become enormous after the application has been running for a long time?

Also, when using the "Complete" output mode, if the result table is very large, writing out all of the data in the result table to the sink will be very time-consuming.
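For context, a hedged sketch of the two writer configurations involved, reusing the wordCounts frame from the sketch above (console sink as a placeholder): "complete" re-emits the whole result table on every trigger, which is what makes it costly for a very large table, while "update" emits only the rows changed since the last trigger, where the query and sink allow it:

```scala
// "complete": the entire, ever-growing result table is written out
// on every trigger, so the cost grows with the size of the table.
val completeQuery = wordCounts.writeStream
  .outputMode("complete")
  .format("console")
  .start()

// "update": only rows updated since the last trigger are written out.
// Aggregation queries support this mode, but not every sink/query
// combination allows it, so treat this as an assumption to verify.
val updateQuery = wordCounts.writeStream
  .outputMode("update")
  .format("console")
  .start()
```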

Recommended answer

To avoid keeping this huge amount of data in memory, Spark Structured Streaming uses watermarks. The main idea is to keep in memory only the data that falls within a specific time window; all the data outside this window is stored in the file system. You can read about watermarks here or here.
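A minimal sketch of how a watermark is declared, assuming the windowed-count pattern from the Spark Structured Streaming guide; the words frame, the column names, and the 10-minute threshold are placeholders rather than part of the original question:

```scala
import org.apache.spark.sql.functions.{col, window}

// `words` is assumed to be a streaming DataFrame with columns
// (timestamp: Timestamp, word: String).
val windowedCounts = words
  // Events arriving more than 10 minutes behind the latest event time
  // seen so far are treated as too late, and the state kept for windows
  // older than the watermark can be dropped instead of growing forever.
  .withWatermark("timestamp", "10 minutes")
  .groupBy(
    window(col("timestamp"), "10 minutes", "5 minutes"),
    col("word"))
  .count()

// With a watermark in place, "update" (or "append") mode lets the engine
// clean up old window state rather than keep the whole history in memory.
val query = windowedCounts.writeStream
  .outputMode("update")
  .format("console")
  .start()
```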
