Update Map type columns in Cassandra with new key-value pairs without completely overwriting the map

Views: 50 | Posted: 2020/9/29 21:10:54 | Tags: scala, apache-spark, cassandra


Question

I have a Spark Dataset of type Dataset[(String, Map[String, String])].

I have to insert it into a Cassandra table.

Here, the key (the String) in the Dataset[(String, Map[String, String])] becomes the primary key of the row in Cassandra.

The Map in the Dataset[(String, Map[String, String])] goes into the same row, in the column ColumnNameValueMap.

My Cassandra table structure is:

CREATE TABLE SampleKeyspace.CassandraTable (
  RowKey text PRIMARY KEY,
  ColumnNameValueMap map<text,text>
);

I was able to insert the data into the Cassandra table using the Spark Cassandra connector.

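Now I am updating the same map column (the second column) with new key-value pairs for the same row key (the first column, the primary key). However, every new update to this column wipes out the previous map.

How can I append to the existing map using the Spark Cassandra connector?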

Recommended answer

I don't think it's possible to do this directly from the DataFrame API, but it is possible via the RDD API. For example, I have the following table with some test data:

CREATE TABLE test.m1 (
    id int PRIMARY KEY,
    m map<int, text>
);
cqlsh> select * from test.m1;
 id | m
----+--------------------
  1 | {1: 't1', 2: 't2'}
(1 rows)

and I have this data in Spark:

scala> val data = Seq((1, Map(3 -> "t3"))).toDF("id", "m")
data: org.apache.spark.sql.DataFrame = [id: int, m: map<int,string>]

Then I can specify that I want to append data to a specific column with the following code:

import com.datastax.spark.connector._  // brings saveToCassandra and SomeColumns into scope
data.rdd.saveToCassandra("test", "m1", SomeColumns("id", "m" append))

and I can see that the data is updated:

cqlsh> select * from test.m1;
id | m
----+----------------------------- 
 1 | {1: 't1', 2: 't2', 3: 't3'}
(1 rows)
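To make the difference between the two behaviours concrete, here is a minimal plain-Scala sketch that needs neither Spark nor Cassandra. It only models what happens to the stored map; the object MapColumnSemantics and the functions overwrite and append are hypothetical names for illustration, not part of the connector.

```scala
// Hypothetical in-memory model of a Cassandra map column; this is not connector code.
object MapColumnSemantics {
  type Column = Map[Int, String]

  // Default save behaviour: the incoming map replaces whatever was stored.
  def overwrite(stored: Column, incoming: Column): Column = incoming

  // Append behaviour: incoming entries are merged into the stored map.
  // If a key already exists, the incoming value wins.
  def append(stored: Column, incoming: Column): Column = stored ++ incoming

  def main(args: Array[String]): Unit = {
    val stored   = Map(1 -> "t1", 2 -> "t2") // the row as it was in test.m1
    val incoming = Map(3 -> "t3")            // the data held in Spark

    println(overwrite(stored, incoming)) // previous entries are lost
    println(append(stored, incoming))    // previous entries are kept
  }
}
```

The append case corresponds to the table contents shown above, where keys 1 and 2 survive and key 3 is added.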

Besides append, the connector supports removing elements with the remove option, and prepend (for lists only). The documentation contains examples of these.
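Applied to the table from the question, the same approach might look like the sketch below. It rests on several assumptions: the Spark Cassandra connector is on the classpath, Cassandra is reachable at 127.0.0.1, and the identifiers are written in lower case because the CREATE TABLE statement did not quote them. The row key "row1" and the entry "colC" -> "valueC" are made-up sample values.

```scala
import org.apache.spark.sql.SparkSession
import com.datastax.spark.connector._

object AppendToMapColumn {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("append-to-map-column")
      // Assumption: a Cassandra node is listening on localhost.
      .config("spark.cassandra.connection.host", "127.0.0.1")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical new entries for an existing row key.
    val updates = Seq(("row1", Map("colC" -> "valueC"))).toDS()

    // Go through the RDD API so the map column can be marked as "append";
    // tuple fields are matched to the listed columns by position.
    updates.rdd.saveToCassandra(
      "samplekeyspace",
      "cassandratable",
      SomeColumns("rowkey", "columnnamevaluemap".append))

    spark.stop()
  }
}
```

The dot form "columnnamevaluemap".append is used instead of the postfix form from the answer only to avoid needing the postfix-operator language import on newer Scala versions; the two spellings are intended to be equivalent.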
