What is the difference between mini-batch and real-time streaming in practice (not in theory)?

Views: 188 | Posted: 2020/7/10 1:48:58 | Tags: apache-spark, batch-processing, apache-flink, data-processing, stream-processing

Question

What is the difference between mini-batch processing and real-time streaming in practice (not in theory)? In theory, I understand that mini-batch processing collects data over a given time frame and processes it as a batch, whereas real-time streaming acts on each record as it arrives. My biggest question is: why not use mini-batches with an epsilon time frame (say, one millisecond)? I would like to understand why one would be a more effective solution than the other.

I recently came across an example where mini-batch processing (Apache Spark) is used for fraud detection and real-time streaming (Apache Flink) is used for fraud prevention. Someone also commented that mini-batches would not be an effective solution for fraud prevention, since the goal is to stop the transaction as it happens. Now I wonder why mini-batch (Spark) would not be as effective here. Why is it not effective to run mini-batches with 1 millisecond latency? Batching is a technique used everywhere, including the OS and the kernel TCP/IP stack, where data going to disk or the network is indeed buffered. So what is the convincing factor for saying one is more effective than the other?
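
To make the question concrete, here is a toy model of the two approaches. It is only a sketch under stated assumptions: the function names, the arrival times, and the fixed per-batch `scheduling_overhead_us` are all hypothetical and are not measurements from Spark or Flink. It simply compares how long each event waits for a decision when events are held until the end of a batch window versus handled on arrival.

```python
def per_event_latency(arrivals_us, processing_us):
    """Latency per event when each event is handled as soon as it arrives.

    Queueing is not modeled, so every event only pays the processing time.
    """
    return [processing_us for _ in arrivals_us]


def micro_batch_latency(arrivals_us, batch_interval_us, processing_us,
                        scheduling_overhead_us):
    """Latency per event when events wait for the end of their batch window.

    scheduling_overhead_us is a hypothetical fixed cost paid once per batch
    (an assumption of this toy model, not a measured value).
    """
    latencies = []
    for t in arrivals_us:
        # End of the window that contains arrival time t.
        window_end = (t // batch_interval_us + 1) * batch_interval_us
        wait = window_end - t
        latencies.append(wait + scheduling_overhead_us + processing_us)
    return latencies


# Hypothetical arrival times in microseconds.
arrivals = [0, 250, 900, 1500]

streaming = per_event_latency(arrivals, processing_us=50)
batched = micro_batch_latency(arrivals, batch_interval_us=1000,
                              processing_us=50, scheduling_overhead_us=200)

print("per-event latencies (us):  ", streaming)
print("micro-batch latencies (us):", batched)
```

In this model, shrinking `batch_interval_us` reduces the waiting term but leaves the per-batch overhead untouched, so whether a 1 millisecond batch is practical depends on how large that overhead really is in a given engine.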
