Write/store dataframe in text file
Question
I am trying to write a dataframe to a text file. If the file contains a single column then I am able to write it to the text file. If the file contains multiple columns then I am facing an error:
Text data source supports only a single column, and you have 2 columns.
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object replace {
  def main(args: Array[String]): Unit = {
    Logger.getLogger("org").setLevel(Level.ERROR)
    val spark = SparkSession.builder.master("local[1]").appName("Decimal Field Validation").getOrCreate()
    var sourcefile = spark.read.option("header","true").text("C:/Users/phadpa01/Desktop/inputfiles/decimalvalues.txt")
    val rowRDD = sourcefile.rdd.zipWithIndex().map(indexedRow => Row.fromSeq((indexedRow._2.toLong+1) +: indexedRow._1.toSeq)) //adding prgrefnbr
    //add column for prgrefnbr in schema
    val newstructure = StructType(Array(StructField("PRGREFNBR",LongType)).++(sourcefile.schema.fields))
    //create new dataframe containing prgrefnbr
    sourcefile = spark.createDataFrame(rowRDD, newstructure)
    val op = sourcefile.write.mode("overwrite").format("text").save("C:/Users/phadpa01/Desktop/op")
  }
}
Answer
You can convert the dataframe to an RDD, convert each row to a string, and write the last line as:
val op= sourcefile.rdd.map(_.toString()).saveAsTextFile("C:/Users/phadpa01/Desktop/op")
EDIT
As @philantrovert and @Pravinkumar have pointed out, the above would append [ and ] in the output file, which is true. The solution would be to replace them with an empty character, as:
val op= sourcefile.rdd.map(_.toString().replace("[","").replace("]", "")).saveAsTextFile("C:/Users/phadpa01/Desktop/op")
One can even use a regex for this.
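For illustration, a minimal pure-Scala sketch of the regex variant (the row string here is made up; in Spark, a Row such as Row(1, "abc", 2.0) prints as "[1,abc,2.0]", which is what the map over the RDD would see):

```scala
object StripBrackets {
  def main(args: Array[String]): Unit = {
    // What Row.toString yields for a three-column row
    val rowString = "[1,abc,2.0]"

    // A single replaceAll with a character class removes both brackets
    // in one pass instead of two chained replace calls.
    val cleaned = rowString.replaceAll("[\\[\\]]", "")
    println(cleaned) // prints 1,abc,2.0
  }
}
```

In the answer above this would read `sourcefile.rdd.map(_.toString().replaceAll("[\\[\\]]", "")).saveAsTextFile(...)`. Note that the character class strips brackets anywhere in the line, including inside field values; anchoring the pattern as `"^\\[|\\]$"` removes only the leading and trailing brackets.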