How to add a Cassandra table column dynamically?
Question
I'm trying to add new columns to a Cassandra table dynamically. I'm using the version below:
cqlsh 5.0.1
I'm using Python to interact with Cassandra. I have a Python list whose entries I want to add as column names to a Cassandra table.
Python list:
['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T']
Currently, I iterate over the list and add each column to the Cassandra table one by one, like this:
from cassandra.cluster import Cluster

cluster = Cluster(['localhost'])
session = cluster.connect()
session.execute("CREATE KEYSPACE IF NOT EXISTS data WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3};")
# The keyspace created above is named "data"; "my_data" is the table
session.execute("USE data")
session.execute("CREATE TABLE IF NOT EXISTS data.my_data (pk uuid PRIMARY KEY);")
names = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T']
for val in names:
    try:
        # One ALTER TABLE request per column name
        session.execute("ALTER TABLE data.my_data ADD " + val + " ascii;")
    except Exception:
        pass  # ignore failures, e.g. the column already exists
This works, but the actual problem is that if my Python list has more than 1000 entries, it takes more than 1000 requests to Cassandra, which is time consuming. Is there a different approach for adding column names to an existing table in Cassandra?
Recommended answer
Cassandra internally stores data as rows. Each row has a key (the partition key) and a dynamic number of columns, one per clustering key value. So you can use the clustering key value as your column name, e.g.:
CREATE TABLE my_data (
    pk text,
    column text,
    value text,
    PRIMARY KEY (pk, column)
);
Insert new columns and values with a regular INSERT query:
INSERT INTO my_data (pk, column, value) VALUES ('pk1', 'A', 'value A');
INSERT INTO my_data (pk, column, value) VALUES ('pk1', 'B', 'value B');
INSERT INTO my_data (pk, column, value) VALUES ('pk1', 'C', 'value C');
...
Get all columns for pk1:
SELECT * FROM my_data WHERE pk='pk1';
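On the Python side, the rows returned by that query can be folded back into a single record keyed by column name. This is a minimal sketch, assuming each row exposes `column` and `value` attributes (as the Python driver's default named-tuple rows do). `rows_to_dict` is a hypothetical helper name, and the sample rows stand in for a real query result.

```python
from collections import namedtuple


def rows_to_dict(rows):
    """Fold the (column, value) rows of one partition into {column: value}."""
    return {row.column: row.value for row in rows}


# Stand-in rows for illustration only. With a live session you would pass
# the result of session.execute("SELECT * FROM my_data WHERE pk = %s", ("pk1",)).
Row = namedtuple("Row", ["pk", "column", "value"])
sample = [Row("pk1", "A", "value A"), Row("pk1", "B", "value B")]
record = rows_to_dict(sample)
```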
Update
Assume you have the table my_data as described above and you want to add some columns and data for a specific pk value.
In Python code, perform the insert queries:
pk = 'pk'
columns_data = {'A': 'value for A', 'B': 'value for B', 'C': 'value for C'}  # dynamic column data
for col_name, col_value in columns_data.items():  # use .iteritems() on Python 2
    try:
        session.execute("INSERT INTO my_data (pk, column, value) VALUES (%s, %s, %s)", (pk, col_name, col_value))
    except Exception:
        pass  # ignore failed inserts
Moreover, you can use the driver's asynchronous methods to get better insert performance.
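One way the asynchronous approach might look is sketched below, using the Python driver's `session.prepare` and `session.execute_async`. The function name `insert_columns_async` is hypothetical, and the sketch assumes `session` is already connected to the keyspace that holds my_data.

```python
def insert_columns_async(session, pk, columns_data):
    """Insert one row per (column name, value) pair without waiting for
    each request to finish before sending the next one."""
    # Prepare once so the statement is parsed a single time.
    insert = session.prepare(
        "INSERT INTO my_data (pk, column, value) VALUES (?, ?, ?)"
    )
    # execute_async returns immediately with a future for each request.
    futures = [
        session.execute_async(insert, (pk, name, value))
        for name, value in columns_data.items()
    ]
    # result() blocks until that request completes and re-raises any error.
    for future in futures:
        future.result()
    return len(futures)
```

Calling `insert_columns_async(session, 'pk', columns_data)` sends all inserts before waiting on any of them. For very large inputs it may be safer to cap the number of in-flight requests, for example with the driver's `cassandra.concurrent.execute_concurrent_with_args` helper.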