Writing Spark Structured Streaming data into Cassandra
Question
I want to write Structured Streaming data into Cassandra using the PySpark API.
My data flow is as follows:
NiFi -> Kafka -> Spark Structured Streaming -> Cassandra
I tried the following:
query = df.writeStream\
.format("org.apache.spark.sql.cassandra")\
.option("keyspace", "demo")\
.option("table", "test")\
.start()
But I get the following error message: "org.apache.spark.sql.cassandra" does not support streaming write.
I also tried another approach (source: DSE 6.0 Administrator Guide):
query = df.writeStream\
.cassandraFormat("test", "demo")\
.start()
But I got an exception: AttributeError: 'DataStreamWriter' object has no attribute 'cassandraFormat'
Can anyone give me some ideas on how to proceed?
Thanks.
Recommended answer
After upgrading to DSE 6.0 (the latest version), I am able to write Structured Streaming data into Cassandra. [Spark 2.2 & Cassandra 3.11]
Reference code:
query = fileStreamDf.writeStream\
    .option("checkpointLocation", "/tmp/check_point/")\
    .format("org.apache.spark.sql.cassandra")\
    .option("keyspace", "analytics")\
    .option("table", "test")\
    .start()
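When this runs as a standalone script rather than in an interactive shell, the driver usually has to block on the returned query, otherwise the process can exit before any data is written. A minimal sketch of that step follows; the helper name `wait_for_query` and its `timeout_seconds` parameter are hypothetical, while `awaitTermination` and `exception` are the standard PySpark `StreamingQuery` methods.

```python
def wait_for_query(query, timeout_seconds=None):
    """Block on a streaming query and report whether it stopped.

    `query` is assumed to be the object returned by writeStream...start().
    `wait_for_query` itself is a hypothetical helper, not part of PySpark.
    """
    if timeout_seconds is None:
        # Blocks until the query is stopped or fails.
        query.awaitTermination()
        finished = True
    else:
        # With a timeout, returns True only if the query terminated in time.
        finished = query.awaitTermination(timeout_seconds)

    # exception() returns the failure that stopped the query, or None.
    error = query.exception() if finished else None
    return finished, error
```

For example, calling `wait_for_query(query)` after the `.start()` call above would keep the job alive until the stream is stopped.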
DSE documentation URL: https://docs.datastax.com/en/dse/6.0/dse-dev/datastax_enterprise/spark/structuredStreaming.html
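If upgrading is not an option and the connector still reports that it does not support streaming writes, one commonly used workaround on Spark 2.4 or later is `foreachBatch`, which hands each micro-batch to an ordinary batch writer. The sketch below is an assumption-laden illustration, not the method from the answer: the function names are hypothetical, the keyspace and table are copied from the reference code, and it presumes the Spark Cassandra Connector is on the classpath.

```python
# Keyspace and table copied from the reference code above; adjust as needed.
KEYSPACE = "analytics"
TABLE = "test"


def write_batch_to_cassandra(batch_df, batch_id):
    """Write one micro-batch using the connector's batch (non-streaming) writer.

    Hypothetical helper; foreachBatch calls it with the micro-batch
    DataFrame and its batch id.
    """
    (batch_df.write
        .format("org.apache.spark.sql.cassandra")
        .options(keyspace=KEYSPACE, table=TABLE)
        .mode("append")
        .save())


def start_cassandra_query(stream_df, checkpoint_dir="/tmp/check_point/"):
    """Start a streaming query that routes every micro-batch to the writer above.

    Hypothetical helper; `stream_df` is assumed to be a streaming DataFrame.
    """
    return (stream_df.writeStream
        .option("checkpointLocation", checkpoint_dir)
        .foreachBatch(write_batch_to_cassandra)  # available from Spark 2.4
        .start())
```

With this in place, `query = start_cassandra_query(fileStreamDf)` would take the place of the direct `.format("org.apache.spark.sql.cassandra")` streaming write. Each micro-batch is appended independently, so a batch that is replayed after a failure is written again; Cassandra's upsert semantics make that harmless as long as the primary key is deterministic.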