How many RDDs does DStream generate for a batch interval?


Question

Does one batch interval of data generate one and only one RDD in a DStream, regardless of how much data arrives in that interval?

Recommended answer

Yes, there is exactly one RDD per batch interval. It is produced at every batch interval, independent of the number of records it contains (there could be zero records inside).
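To make the "one RDD per interval, even if empty" behaviour concrete, here is a toy model in plain Python. It is not the Spark API: the function `micro_batches`, the `(timestamp, record)` event format, and the sample data are all assumptions made up for illustration. Each inner list stands in for the single RDD of one batch interval.

```python
from typing import List, Tuple


def micro_batches(
    events: List[Tuple[float, str]],
    batch_interval: float,
    num_intervals: int,
) -> List[List[str]]:
    """Toy model of micro-batching (hypothetical helper, not a Spark call).

    Returns exactly one batch per interval; an interval that received
    no records still gets a batch, which is simply empty.
    """
    # One slot per interval is created up front, regardless of the data.
    batches: List[List[str]] = [[] for _ in range(num_intervals)]
    for timestamp, record in events:
        index = int(timestamp // batch_interval)
        if 0 <= index < num_intervals:
            batches[index].append(record)
    return batches


# Assumed sample input: three records in the first second, none in the
# second, one in the third, none in the fourth.
events = [(0.2, "a"), (0.7, "b"), (0.9, "c"), (2.5, "d")]
batches = micro_batches(events, batch_interval=1.0, num_intervals=4)

for i, batch in enumerate(batches):
    print(f"interval {i}: {len(batch)} record(s) -> {batch}")
```

The number of batches depends only on how many intervals have elapsed, never on how many records arrived, which is the property the answer describes.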

If that were not the case, and RDD creation were conditioned on the number of elements, you wouldn't have synchronous (micro-batching) streaming, but rather a form of asynchronous processing.
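For contrast, the hypothetical alternative can be sketched the same way. This is again plain Python rather than Spark, and `count_triggered_batches`, its `batch_size` parameter, and the sample events are illustrative assumptions. Here a batch is emitted only once enough records have accumulated, so the emission times follow the data instead of a clock.

```python
from typing import List, Tuple


def count_triggered_batches(
    events: List[Tuple[float, str]],
    batch_size: int,
) -> List[Tuple[float, List[str]]]:
    """Toy model of count-conditioned batching (hypothetical, not Spark).

    Emits (emit_time, records) each time batch_size records have
    accumulated; leftover records stay pending and are not emitted.
    """
    emitted: List[Tuple[float, List[str]]] = []
    pending: List[str] = []
    for timestamp, record in events:
        pending.append(record)
        if len(pending) == batch_size:
            # The batch is released at the arrival time of its last record.
            emitted.append((timestamp, pending))
            pending = []
    return emitted


# Assumed sample input with irregular arrival times.
events = [(0.2, "a"), (0.7, "b"), (0.9, "c"), (2.5, "d"), (7.0, "e")]
emitted = count_triggered_batches(events, batch_size=2)
emit_times = [t for t, _ in emitted]

for emit_time, records in emitted:
    print(f"emitted at t={emit_time}: {records}")
```

In this model the gaps between batches are irregular and the last record is never emitted on its own, which illustrates why such a scheme would no longer be interval-driven micro-batching.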
