Efficient way to store a JSON string in a Cassandra column?

Question

Cassandra newbie question. I'm collecting some data from a social networking site using REST calls. So I end up with the data coming back in JSON format.

The JSON is only one of the columns in my table. I'm trying to figure out what the "best practice" is for storing the JSON string.

First I thought of using the map type, but the JSON contains a mix of strings, numeric types, and so on, and it doesn't seem like I can declare wildcard types for the map key/value. The JSON string can be quite large, probably over 10 KB. I could store it as a string, but that seems inefficient. I assume this is a common task, so I'm sure there are some general guidelines for how to do it.

I know Cassandra has native support for JSON, but from what I understand, that's mostly used when the entire JSON map matches 1-1 with the database schema. That's not the case for me. The schema has a bunch of columns and the JSON string is just a sort of "payload". Is it better to store the JSON string as a blob or as text? BTW, the Cassandra version is 2.1.5.

Any hints appreciated. Thanks in advance.

Recommended answer

In the Cassandra storage engine there's really not a big difference between a blob and a text column, since Cassandra essentially stores text as a blob (UTF-8 encoded bytes). And yes, the "native" JSON support you mention only applies when your data model matches your JSON model, and it's only available in Cassandra 2.2+.
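To make the "text is essentially a blob" point concrete, here is a minimal sketch using only the Java standard library. It does not talk to Cassandra at all. It just shows that a JSON string and its UTF-8 bytes carry the same information, which is why the choice between the two column types matters little for storage. The class name, method names, and sample JSON are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class JsonPayloadBytes {
    // Encode a JSON string as UTF-8 bytes, the form a blob column would hold.
    public static ByteBuffer toBlob(String json) {
        return ByteBuffer.wrap(json.getBytes(StandardCharsets.UTF_8));
    }

    // Decode UTF-8 bytes back into the original JSON string.
    // duplicate() leaves the caller's buffer position untouched.
    public static String fromBlob(ByteBuffer blob) {
        return StandardCharsets.UTF_8.decode(blob.duplicate()).toString();
    }

    public static void main(String[] args) {
        // Hypothetical payload, standing in for a REST response body.
        String json = "{\"user\":\"alice\",\"followers\":42,\"tags\":[\"a\",\"b\"]}";
        ByteBuffer blob = toBlob(json);
        System.out.println("payload bytes: " + blob.remaining());
        System.out.println("round trip ok: " + json.equals(fromBlob(blob)));
    }
}
```

The practical difference is mostly convenience: with text you bind and read a String directly, while with blob you do the encoding and decoding yourself.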

I would store it as a text type. You shouldn't have to implement anything to compress your JSON data when sending it (or to handle decompressing it), since Cassandra's binary protocol supports transport compression. Also make sure your table stores its data compressed with the same compression algorithm (I suggest LZ4, since it's the fastest algorithm implemented), so the payload is compressed on disk as well. If you configure the table to store data compressed and enable transport compression, you don't have to implement either yourself.
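As a rough sketch of the table side of this, the helper below builds a CREATE TABLE statement with an LZ4-compressed text payload column. The keyspace, table, and column names are hypothetical. The `sstable_compression` option key is the Cassandra 2.1 syntax (later major versions use `class` instead), so treat the exact option name as something to check against your version.

```java
public class PayloadTableDdl {
    // Build a CREATE TABLE statement with SSTable compression enabled.
    // 'sstable_compression' is the option key used by Cassandra 2.1.
    public static String createTable(String table, String compressor) {
        return "CREATE TABLE IF NOT EXISTS " + table + " (\n"
             + "  id uuid PRIMARY KEY,\n"        // hypothetical key column
             + "  source text,\n"                // hypothetical metadata column
             + "  fetched_at timestamp,\n"       // hypothetical metadata column
             + "  payload text\n"                // the raw JSON string
             + ") WITH compression = { 'sstable_compression' : '"
             + compressor + "' };";
    }

    public static void main(String[] args) {
        // Print the statement so it can be pasted into cqlsh.
        System.out.println(createTable("social.posts", "LZ4Compressor"));
    }
}
```

For the transport side, the DataStax Java driver 2.x lets you request compression when building the cluster, along the lines of `Cluster.builder().addContactPoint("127.0.0.1").withCompression(ProtocolOptions.Compression.LZ4).build()`. The contact point here is a placeholder, and LZ4 needs the LZ4 library on the client classpath.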

You didn't say which client driver you're using, but the DataStax Java driver documentation describes how to set up transport compression.
