Spark Scala Streaming CSV
Problem Description
I am new to Spark/Scala. I know how to load CSV files:
sqlContext.read.format("csv")
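For context, a fuller batch read along these lines might look like the following sketch (the path and the `header`/`inferSchema` options are illustrative, not from the question):

```scala
// Batch CSV load for comparison; path and options are illustrative
val df = sqlContext.read
  .format("csv")
  .option("header", "true")        // treat the first line as column names
  .option("inferSchema", "true")   // sample the file to guess column types
  .load("file:///C:/foo/bar/data.csv")
```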
and how to read text streams and file streams:
scc.textFileStream("""file:///c:\path\filename""");
scc.fileStream[LongWritable, Text, TextInputFormat](...)
but how do I read a text stream in CSV format? Thanks, Levi
Recommended Answer
Here you go:
val ssc = new StreamingContext(sparkConf, Seconds(5))
// Create the FileInputDStream on the directory
val lines = ssc.textFileStream("file:///C:/foo/bar")
lines.foreachRDD(rdd => {
  if (!rdd.isEmpty()) {
    println("RDD row count: " + rdd.count())
    // Now you can convert this RDD to a DataFrame/Dataset and apply your business logic.
  }
})
ssc.start()
ssc.awaitTermination()
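To actually treat each micro-batch as CSV, one approach is to split the lines and build a DataFrame inside `foreachRDD`. The sketch below assumes a two-column comma-separated file with the hypothetical column names `col1` and `col2`; adjust the split and names to your data:

```scala
import org.apache.spark.sql.SparkSession

lines.foreachRDD(rdd => {
  if (!rdd.isEmpty()) {
    // Reuse (or lazily create) a SparkSession from the RDD's context
    val spark = SparkSession.builder
      .config(rdd.sparkContext.getConf)
      .getOrCreate()
    import spark.implicits._

    // Assumes two comma-separated columns; column names are illustrative
    val df = rdd.map(_.split(","))
      .map(cols => (cols(0), cols(1)))
      .toDF("col1", "col2")
    df.show()
  }
})
```

On Spark 2.x and later you can avoid the manual split entirely with Structured Streaming, e.g. `spark.readStream.option("header", "true").schema(mySchema).csv("file:///C:/foo/bar")`; note that streaming file sources require an explicit schema rather than inference.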