In Spark, sc.newAPIHadoopRDD is reading 2.7 GB of data with 5 partitions

Views: 30 | Published: 2021/11/14 22:52:21 | Tags: apache-spark, hbase, apache-spark-sql

Problem description

I am using Spark 1.4 and reading 2.7 GB of data from HBase with sc.newAPIHadoopRDD. Only 5 tasks are created for this stage, and it takes 2 to 3 minutes to process. Can anyone tell me how to increase the number of partitions so the data is read faster?
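
For context, the setup described above might look like the following minimal Scala sketch. It is an illustration under stated assumptions, not the asker's actual code: the table name "my_table", the object name, the targetPartitions helper, and the 128 MB per-partition target are all hypothetical. With HBase's TableInputFormat, the scan normally produces one input split per table region, so 5 tasks would suggest a table with 5 regions. Calling repartition after the read spreads downstream work over more tasks, but it does not parallelize the scan itself; for that, the table would need more regions.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.{SparkConf, SparkContext}

object HBaseReadSketch {
  // Hypothetical helper: ceiling division of the data size by a
  // per-partition target size, to pick a partition count.
  def targetPartitions(totalBytes: Long, bytesPerPartition: Long): Int =
    ((totalBytes + bytesPerPartition - 1) / bytesPerPartition).toInt

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hbase-read-sketch"))

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table") // hypothetical table name

    // The read from the question: one partition per input split.
    val raw = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])
    println(s"partitions from the scan: ${raw.partitions.length}")

    // ImmutableBytesWritable and Result are not java.io.Serializable,
    // so convert to a serializable type before any shuffle.
    val rowKeys = raw.map { case (_, result) => Bytes.toString(result.getRow) }

    // Assumption: about 2.7 GB of data, aiming for roughly 128 MB per partition.
    val n = targetPartitions(2899102925L, 128L * 1024 * 1024)
    val spread = rowKeys.repartition(n)
    println(s"partitions after repartition: ${spread.partitions.length}")

    sc.stop()
  }
}
```

Because repartition triggers a shuffle, it only pays off when the processing after the read is the expensive part; if the 2 to 3 minutes are spent in the scan, the region count is the more likely lever.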
