Efficient way to store a JSON string in a Cassandra column?


Question

Cassandra newbie question. I'm collecting some data from a social networking site using REST calls, so the data comes back in JSON format.

The JSON is only one of the columns in my table. I'm trying to figure out what the "best practice" is for storing the JSON string.

First I thought of using the map type, but the JSON contains a mix of strings, numerical types, etc., and it doesn't seem like I can declare wildcard types for the map key/value. The JSON string can be quite large, probably over 10 KB in size. I could store it as a string, but that seems like it would be inefficient. I assume this is a common task, so I'm sure there are some general guidelines for how to do it.

I know Cassandra has native support for JSON, but from what I understand, that's mostly used when the entire JSON map matches 1-1 with the database schema. That's not the case for me. The schema has a bunch of columns and the JSON string is just a sort of "payload". Is it better to store the JSON string as a blob or as text? BTW, the Cassandra version is 2.1.5.

Any hints appreciated. Thanks in advance.

Recommended answer

In the Cassandra storage engine there's really not a big difference between a blob and a text, since Cassandra essentially stores text as blobs. And yes, the "native" JSON support you speak of only applies when your data model matches your JSON model, and it's only available in Cassandra 2.2+.
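To illustrate why the two types end up so close: a CQL `text` value is a UTF-8 encoded string, so the bytes behind a `text` column are the same bytes you would put in a `blob` if you encoded the JSON yourself. The following is a minimal sketch using only the Java standard library; the class name, method names, and sample JSON are made up for illustration and no Cassandra connection is involved.

```java
import java.nio.charset.StandardCharsets;

public class JsonTextVsBlob {
    // Bytes a JSON string occupies when encoded as UTF-8, which is
    // the encoding CQL uses for text columns.
    public static byte[] asTextBytes(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding the same bytes as UTF-8 recovers the original string,
    // which is all a "blob" round trip would have to do by hand.
    public static String fromBytes(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical payload, only for demonstration.
        String json = "{\"user\":\"alice\",\"followers\":42,\"tags\":[\"a\",\"b\"]}";
        byte[] raw = asTextBytes(json);
        System.out.println("payload size in bytes: " + raw.length);
        System.out.println("round trip intact: " + fromBytes(raw).equals(json));
    }
}
```

Choosing `text` mainly saves you from doing this encode/decode step yourself and keeps the payload readable in cqlsh.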

I would store it as a text type. You shouldn't have to implement anything to compress your JSON data when sending it (or to handle decompressing it), since Cassandra's binary protocol supports transport compression. Also make sure your table stores its data compressed with the same compression algorithm (I suggest LZ4, since it's the fastest algorithm implemented) to save on compression work for each read request. So if you configure the table to store data compressed and turn on transport compression, you don't have to implement either yourself.

You didn't say which client driver you're using, but here's the documentation on how to set up transport compression for the DataStax Java client driver.
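As a rough sketch of how the two settings fit together with the DataStax Java driver 2.x (the `com.datastax.driver.core` API): transport compression is requested on the `Cluster` builder, and on-disk compression is a table option. The contact point, keyspace, table, and column names below are assumptions for illustration. The `sstable_compression` option name is the one used by Cassandra 2.x (later versions renamed it to `class`), and LZ4 transport compression needs the LZ4 library on the client classpath.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ProtocolOptions;
import com.datastax.driver.core.Session;

public class JsonPayloadExample {
    public static void main(String[] args) {
        // Ask the driver to compress frames on the wire with LZ4.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // assumed local node
                .withCompression(ProtocolOptions.Compression.LZ4)
                .build();
        try {
            Session session = cluster.connect();

            // Hypothetical keyspace for the demo.
            session.execute(
                "CREATE KEYSPACE IF NOT EXISTS demo WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 1}");

            // Regular columns plus a text column holding the JSON payload;
            // the table is stored with LZ4 compression (2.x option name).
            session.execute(
                "CREATE TABLE IF NOT EXISTS demo.posts ("
                + "id text PRIMARY KEY, author text, payload text) "
                + "WITH compression = {'sstable_compression': 'LZ4Compressor'}");

            // The JSON is bound as a plain String; no manual compression.
            PreparedStatement insert = session.prepare(
                "INSERT INTO demo.posts (id, author, payload) VALUES (?, ?, ?)");
            session.execute(insert.bind("post-1", "alice", "{\"likes\":3}"));
        } finally {
            cluster.close();
        }
    }
}
```

The application code only ever sees the JSON as a `String`; both compression layers are handled by the driver and the server.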
