How to store an Array[String] to an output file
Question
I have an Array[String] called samparr with some values in it, and I want to store it as an output file.
var samparr: Array[String] = new Array[String](4)
samparr +:= print1 + " BEST_MATCH " + print2
Something like:
val output = samparr.saveAsTextFile(outputpath)
but samparr isn't an RDD, it's an Array[String], so saveAsTextFile is not available on it.
Recommended answer
You can use SparkContext.parallelize to "distribute" your Array onto the Spark cluster (in other words, to turn it into an RDD), and then call saveAsTextFile:
sc.parallelize(samparr).saveAsTextFile(outputpath)
This action partitions the data and sends each partition to one of the executors. Each partition is then saved as a separate "file-part", so outputpath ends up being a directory of part files rather than a single file.
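As a rough sketch of the approach above, assuming Spark is on the classpath and a local master is acceptable for a quick test, the following shows the whole round trip. The object name, the app name, the sample values and the output directory are all made up for illustration; passing numSlices = 1 is one way to end up with a single part file instead of several.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SaveArrayWithSpark {
  // Turns the array into an RDD and saves it under outputpath.
  // outputpath becomes a directory and must not already exist.
  def save(sc: SparkContext, samparr: Array[String], outputpath: String): Unit = {
    // numSlices = 1 gives one partition, hence one part file.
    sc.parallelize(samparr, numSlices = 1).saveAsTextFile(outputpath)
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are illustrative assumptions.
    val conf = new SparkConf().setMaster("local[2]").setAppName("save-array-sketch")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical sample values standing in for print1 / print2.
      val samparr = Array("a BEST_MATCH b", "c BEST_MATCH d")
      save(sc, samparr, "samparr-output") // hypothetical directory name
    } finally {
      sc.stop()
    }
  }
}
```

With a single partition, the lines should land in a file named part-00000 inside the output directory.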
Alternatively, since the array is very small and doesn't really "justify" using Spark, you can use any non-Spark method of saving data to a file, e.g. the one linked by @avihoo-mamka: How to write to a file in Scala?
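One possible non-Spark version, shown here only as a minimal sketch and not necessarily the method from the linked question, writes each element on its own line with java.io.PrintWriter. The object name, the helper saveLines, the sample values and the file name are hypothetical.

```scala
import java.io.PrintWriter

object SaveArrayToFile {
  // Writes every element of lines to path, one element per line, as UTF-8.
  def saveLines(lines: Array[String], path: String): Unit = {
    val writer = new PrintWriter(path, "UTF-8")
    try {
      lines.foreach(line => writer.println(line))
    } finally {
      writer.close() // always release the file handle
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample values standing in for print1 / print2.
    val samparr = Array("a BEST_MATCH b", "c BEST_MATCH d")
    saveLines(samparr, "samparr.txt") // hypothetical file name
  }
}
```

Unlike saveAsTextFile, this produces a single ordinary file on the local file system of the machine running the code. Note that an array created with new Array[String](4) starts out holding four null entries, which would be written as the text "null" unless they are filtered out first.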