Import CSV with multi-valued (collection) attributes to Cassandra

Question

Suppose I would like to import a csv file into the following table:

CREATE TABLE example_table (
  id int PRIMARY KEY,
  comma_delimited_str_list list<ascii>,
  space_delimited_str_list list<ascii>
);

where comma_delimited_str_list and space_delimited_str_list are two list-attributes which use comma and space as their delimiter respectively.

An example csv record would be:

12345,"hello,world","stack overflow"

where I would like to treat "hello,world" and "stack overflow" as two multi-valued attributes.

How can I import such a CSV file into the corresponding table in Cassandra, preferably using CQL COPY?

Recommended answer

CQL 1.2 can import a CSV file with multi-valued fields directly into a table. However, those multi-valued fields must be written in the CQL collection-literal format.

For example, lists must be in the form ['abc','def','ghi'], and sets must be in the form {'123','456','789'}.
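
As a rough illustration of that format, the Python sketch below builds list and set literals from plain strings. The helper names (cql_quote, to_cql_list, to_cql_set) are hypothetical and not part of Cassandra or cqlsh. Doubling an embedded single quote is the standard CQL string-literal escape; whether a given cqlsh version's COPY accepts it inside a collection field is an assumption worth checking on a test table first.

```python
def cql_quote(value):
    """Quote a string as a CQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def to_cql_list(values):
    """Render an iterable of strings as a CQL list literal, keeping order."""
    return "[" + ",".join(cql_quote(v) for v in values) + "]"


def to_cql_set(values):
    """Render an iterable of strings as a CQL set literal.

    Duplicates are dropped and the elements are sorted only so that the
    generated text is deterministic.
    """
    return "{" + ",".join(cql_quote(v) for v in sorted(set(values))) + "}"


if __name__ == "__main__":
    print(to_cql_list(["abc", "def", "ghi"]))
    print(to_cql_set(["123", "456", "789"]))
```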

Below is an example of importing CSV-formatted data from STDIN into the example_table mentioned in the question:

cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 12345,"['hello','world']","['stack','overflow']"
[copy] 56780,"['this','is','a','test','list']","['here','is','another','one']"
[copy] \.

2 rows imported in 11.304 seconds.
cqlsh:demo> select * from example_table;

 id    | comma_delimited_str_list  | space_delimited_str_list
-------+---------------------------+--------------------------
 12345 |            [hello, world] |        [stack, overflow]
 56780 | [this, is, a, test, list] | [here, is, another, one]

Importing incorrectly formatted list or set values from a CSV file raises an error:

cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 9999,"hello","world"
Bad Request: line 1:108 no viable alternative at input ','
Aborting import at record #0 (line 1). Previously-inserted values still present.

The above input should be replaced by 9999,"['hello']","['world']":

cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 9999,"['hello']","['world']"
[copy] \.

1 rows imported in 16.859 seconds.
cqlsh:demo> select * from example_table;

 id    | comma_delimited_str_list  | space_delimited_str_list
-------+---------------------------+--------------------------
  9999 |                   [hello] |                  [world]
 12345 |            [hello, world] |        [stack, overflow]
 56780 | [this, is, a, test, list] | [here, is, another, one]
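
The question's original file uses plain delimited strings rather than list literals, so it has to be rewritten before COPY will accept it. One possible way to do that is sketched below in Python, assuming every record has exactly three columns (id, a comma-delimited string, a space-delimited string) and that the values contain no single quotes. The function names convert_row and convert are hypothetical.

```python
import csv
import io


def to_cql_list(values):
    """Render strings as a CQL list literal (assumes no embedded single quotes)."""
    return "[" + ",".join("'" + v + "'" for v in values) + "]"


def convert_row(row):
    """Turn [id, 'a,b', 'c d'] into [id, "['a','b']", "['c','d']"]."""
    row_id, comma_field, space_field = row
    comma_items = [v for v in comma_field.split(",") if v]
    space_items = space_field.split()  # splits on any run of whitespace
    return [row_id, to_cql_list(comma_items), to_cql_list(space_items)]


def convert(src, dst):
    """Read the original CSV from src and write a COPY-ready CSV to dst."""
    # csv.writer adds double quotes around any field that contains a comma.
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow(convert_row(row))


if __name__ == "__main__":
    source = io.StringIO('12345,"hello,world","stack overflow"\n')
    target = io.StringIO()
    convert(source, target)
    print(target.getvalue(), end="")
```

With real files, the two io.StringIO objects would be replaced by files opened with newline="", and the output file could then be passed to COPY example_table FROM.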
