Import CSV with multi-valued (collection) attributes to Cassandra
Problem description
Suppose I would like to import a CSV file into the following table:
CREATE TABLE example_table (
id int PRIMARY KEY,
comma_delimited_str_list list<ascii>,
space_delimited_str_list list<ascii>
);
where comma_delimited_str_list and space_delimited_str_list are two list attributes that use a comma and a space as their delimiters, respectively.
An example CSV record would be:
12345,"hello,world","stack overflow"
where I would like to treat "hello,world" and "stack overflow" as two multi-valued attributes.
How can I import such a CSV file into its corresponding table in Cassandra, preferably using CQL COPY?
Recommended answer
The cqlsh COPY command (as of Cassandra 1.2, which introduced collections) can import a CSV file with multi-valued fields directly into a table. However, those multi-valued fields must be written in CQL collection literal syntax.
For example, lists must be in the form ['abc','def','ghi'], and sets must be in the form {'123','456','789'}.
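As a rough illustration of how such literals could be produced programmatically, here is a minimal Python sketch. The helper names (cql_quote, cql_list_literal, cql_set_literal) are hypothetical and not part of any Cassandra tooling. It assumes the elements are plain strings and doubles any embedded single quote, which is how CQL escapes quotes inside string literals.

```python
def cql_quote(value):
    """Quote a string as a CQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def cql_list_literal(values):
    """Render an iterable of strings as a CQL list literal, preserving order."""
    return "[" + ",".join(cql_quote(v) for v in values) + "]"


def cql_set_literal(values):
    """Render an iterable of strings as a CQL set literal.

    Duplicates are dropped and elements are sorted only to make the
    output deterministic; a set has no meaningful input order.
    """
    return "{" + ",".join(cql_quote(v) for v in sorted(set(values))) + "}"


print(cql_list_literal(["abc", "def", "ghi"]))  # ['abc','def','ghi']
print(cql_set_literal(["123", "456", "789"]))   # {'123','456','789'}
```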
Below is an example of importing CSV-formatted data from STDIN into the example_table mentioned in the question:
cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 12345,"['hello','world']","['stack','overflow']"
[copy] 56780,"['this','is','a','test','list']","['here','is','another','one']"
[copy] \.
2 rows imported in 11.304 seconds.
cqlsh:demo> select * from example_table;
 id    | comma_delimited_str_list  | space_delimited_str_list
-------+---------------------------+--------------------------
 12345 | [hello, world]            | [stack, overflow]
 56780 | [this, is, a, test, list] | [here, is, another, one]
Importing incorrectly formatted list or set values from a CSV file raises an error:
cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 9999,"hello","world"
Bad Request: line 1:108 no viable alternative at input ','
Aborting import at record #0 (line 1). Previously-inserted values still present.
The above input should be replaced by 9999,"['hello']","['world']":
cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 9999,"['hello']","['world']"
[copy] \.
1 rows imported in 16.859 seconds.
cqlsh:demo> select * from example_table;
 id    | comma_delimited_str_list  | space_delimited_str_list
-------+---------------------------+--------------------------
 9999  | [hello]                   | [world]
 12345 | [hello, world]            | [stack, overflow]
 56780 | [this, is, a, test, list] | [here, is, another, one]
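To get from the CSV layout in the question (12345,"hello,world","stack overflow") to the format COPY accepts, one option is to preprocess the file before importing it. The following is a minimal Python sketch under stated assumptions: every row has exactly three fields, the second is comma-delimited, the third is whitespace-delimited, and the function names convert and cql_list_literal are hypothetical helpers introduced only for illustration.

```python
import csv
import io


def cql_list_literal(values):
    """Render strings as a CQL list literal, doubling embedded single quotes."""
    quoted = ("'" + v.replace("'", "''") + "'" for v in values)
    return "[" + ",".join(quoted) + "]"


def convert(src, dst):
    """Rewrite rows of (id, comma-delimited, space-delimited) for cqlsh COPY."""
    reader = csv.reader(src)
    # QUOTE_NONNUMERIC leaves the integer id bare and wraps the list
    # literals in double quotes, matching the format shown above.
    writer = csv.writer(dst, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for row_id, comma_field, space_field in reader:
        writer.writerow([
            int(row_id),
            # Skip empty items so an empty field becomes [] rather than [''].
            cql_list_literal([v for v in comma_field.split(",") if v]),
            cql_list_literal(space_field.split()),
        ])


# In-memory demonstration using the record from the question.
source = io.StringIO('12345,"hello,world","stack overflow"\n')
target = io.StringIO()
convert(source, target)
print(target.getvalue(), end="")
# 12345,"['hello','world']","['stack','overflow']"
```

With real files, the same function can be called on objects opened with open(path, newline=""). The converted file (named converted.csv here purely as an example) could then be loaded with copy example_table from 'converted.csv'; instead of typing rows at the STDIN prompt.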