Is there any better method than collect to read an RDD in Spark?


Problem description


So, I want to read an RDD into an array. For that purpose, I could use the collect method. But that method is really annoying, as in my case it keeps giving Kryo buffer overflow errors. If I set the Kryo buffer size too high, it starts to cause its own problems. On the other hand, I have noticed that if I just save the RDD to a file using the saveAsTextFile method, I get no errors. So I was thinking there must be some better method of reading an RDD into an array that isn't as problematic as the collect method.
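For reference, a minimal sketch of the collect path described above, assuming a Kryo-serialized job whose buffer limit is raised via spark.kryoserializer.buffer.max (the app name, dataset, and buffer value are illustrative, not from the original question):

import org.apache.spark.{SparkConf, SparkContext}

// Illustrative configuration: Kryo serialization with a larger max buffer,
// which is the knob the question refers to when "setting the Kryo buffer size".
val conf = new SparkConf()
  .setAppName("collect-example") // hypothetical app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryoserializer.buffer.max", "256m") // illustrative value
val sc = new SparkContext(conf)

val rdd = sc.parallelize(1 to 1000000) // stand-in for the real RDD

// collect pulls every partition back to the driver and materializes the
// whole dataset as a local Array -- this is where the memory pressure appears.
val asArray: Array[Int] = rdd.collect()

Raising the buffer only postpones the problem: the driver still has to hold the entire collected dataset in memory.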

Recommended answer


No. collect is the only method for reading an RDD into an array.


saveAsTextFile never has to collect all the data to one machine, so it is not limited by the available memory on a single machine in the same way that collect is.
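As a hedged sketch of that contrast (the SparkContext setup and output path below are illustrative, not from the answer): each executor writes its own partitions directly to storage, so the driver never materializes the full dataset.

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("save-example")) // hypothetical app name
val rdd = sc.parallelize(1 to 1000000) // stand-in for the real RDD

// saveAsTextFile writes one part file per partition from the executors,
// so no single machine ever holds the whole dataset in memory.
rdd.saveAsTextFile("hdfs:///tmp/rdd-output") // illustrative output path

Reading the resulting part files back is then a separate step, which is why this route avoids the single-machine memory limit that collect runs into.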
