How does column-oriented NoSQL differ from document-oriented?


Question

The three types of NoSQL databases I've read about are key-value, column-oriented, and document-oriented.

Key-value is pretty straightforward: a key with a plain value.

I've seen document-oriented databases described as being like key-value stores, except that the value can be a structure, such as a JSON object. Each "document" can have all, some, or none of the same keys as another.
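As a minimal sketch of that description (not tied to any particular database), a document collection can be modelled as a dictionary from key to structured value. The keys and field names below are made up for illustration.

```python
import json

# Hypothetical documents in one collection; all field names are illustrative.
documents = {
    "user:1": {"name": "Ann", "email": "ann@example.com"},
    "user:2": {"name": "Bob", "tags": ["admin"]},
    "user:3": {"visits": 7},
}


def shared_keys(a, b):
    """Return the field names that two documents have in common."""
    return sorted(set(a) & set(b))


# Documents may share some keys...
print(shared_keys(documents["user:1"], documents["user:2"]))
# ...or none at all.
print(shared_keys(documents["user:1"], documents["user:3"]))
# Each value is a whole structure that serialises naturally to JSON.
print(json.dumps(documents["user:2"]))
```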

Column-oriented seems to be very much like document-oriented in that you don't specify a structure.

So what is the difference between these two, and why would you use one over the other?

I've specifically looked at MongoDB and Cassandra. I basically need a dynamic structure that can change without affecting other values. At the same time, I need to be able to search/filter on specific keys and run reports. With CAP, AP is the most important to me. The data can "eventually" be synced across nodes, as long as there is no conflict or loss of data. Each user would get their own "table".

Recommended answer

In Cassandra, each row (addressed by a key) contains one or more "columns". Columns are themselves key-value pairs. The column names need not be predefined, i.e. the structure isn't fixed. Columns in a row are stored in sorted order according to their keys (names).

In some cases, you may have a very large number of columns in a row (e.g. to act as an index that enables particular kinds of query). Cassandra can handle such large structures efficiently, and you can retrieve specific ranges of columns.
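To illustrate why sorted column names make range retrieval cheap, here is a toy in-memory model of a single row. It is only a sketch of the idea, not Cassandra's API or storage format; the `Row` class, its methods, and the date-named columns are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


class Row:
    """Toy model of one wide row: column names are kept in sorted order."""

    def __init__(self):
        self._names = []   # sorted list of column names
        self._values = {}  # column name -> value

    def set(self, name, value):
        # Insert new names at their sorted position; overwrite existing ones.
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, end)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


# Hypothetical "index" row: one column per event, named by date.
row = Row()
for day, event in [
    ("2011-03-02", "login"),
    ("2011-01-15", "signup"),
    ("2011-02-20", "purchase"),
    ("2011-04-09", "logout"),
]:
    row.set(day, event)

print(row.slice("2011-02-01", "2011-03-31"))
# [('2011-02-20', 'purchase'), ('2011-03-02', 'login')]
```

Because the names are sorted, a range query is two binary searches plus a contiguous read, however many columns the row holds.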

There is a further level of structure (not so commonly used) called super-columns, where a column contains nested (sub)columns.

You can think of the overall structure as a nested hashtable/dictionary, with 2 or 3 levels of key.

Standard column family:

row
    col  col  col ...
    val  val  val ...

Super column family:

row
      supercol                      supercol                     ...
          (sub)col  (sub)col  ...       (sub)col  (sub)col  ...
           val       val      ...        val       val      ...
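The two layouts can be sketched as plain nested dictionaries, which is just the "nested hashtable" analogy written out. The row, column, and value names are placeholders.

```python
# Standard column family: two levels of key (row -> column -> value).
standard_cf = {
    "row1": {"colA": "a1", "colB": "b1"},
    "row2": {"colB": "b2", "colC": "c2", "colD": "d2"},  # rows need not share columns
}

# Super column family: three levels (row -> supercolumn -> subcolumn -> value).
super_cf = {
    "row1": {
        "super1": {"subA": "x", "subB": "y"},
        "super2": {"subC": "z"},
    },
}

# A value is addressed by giving one key per level.
print(standard_cf["row1"]["colA"])
print(super_cf["row1"]["super1"]["subB"])
```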

There are also higher-level structures - column families and keyspaces - which can be used to divide up or group together your data.

See also this question: Cassandra: What is a subcolumn

Or see http://wiki.apache.org/cassandra/ArticlesAndPresentations

Re: comparison with document-oriented databases - the latter usually insert whole documents (typically JSON), whereas in Cassandra you can address individual columns or supercolumns and update them individually, i.e. the two work at different levels of granularity. Each column has its own separate timestamp/version (used to reconcile updates across the distributed cluster).
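A simplified sketch of how per-column timestamps let two replicas of the same row be reconciled: for each column, the write with the newer timestamp wins. This ignores deletions and timestamp ties, and the `merge_row` helper and sample data are assumptions for illustration, not Cassandra code.

```python
def merge_row(a, b):
    """Merge two replicas of a row.

    Each replica maps column name -> (timestamp, value). For every column,
    the entry with the newer timestamp is kept.
    """
    merged = dict(a)
    for name, (ts, value) in b.items():
        if name not in merged or ts > merged[name][0]:
            merged[name] = (ts, value)
    return merged


# Two replicas that received different column updates for the same row.
replica_1 = {"email": (10, "old@example.com"), "city": (12, "Oslo")}
replica_2 = {"email": (15, "new@example.com"), "phone": (11, "555-0100")}

merged = merge_row(replica_1, replica_2)
print(merged["email"])  # the newer email wins
print(merged["city"])   # an update seen by only one replica is kept
print(merged["phone"])
```

Since each column carries its own timestamp, concurrent updates to different columns of one row do not overwrite each other, which would not hold if the row were stored and replaced as a single whole document.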

The Cassandra column values are just bytes, but can be typed as ASCII, UTF8 text, numbers, dates etc.

Of course, you could use Cassandra as a primitive document store by inserting columns containing JSON - but you wouldn't get all the features of a real document-oriented store.
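A rough sketch of what "JSON in a column" means in practice, using a plain dictionary as a stand-in for a row. The column and field names are hypothetical. The store sees the JSON only as an opaque string, so changing one field means rewriting the whole value on the client side.

```python
import json

# One row whose single column holds a whole JSON document as text.
row = {"profile": json.dumps({"name": "Ann", "city": "Oslo"})}

# To change one field, the client must read, decode, modify, re-encode,
# and write back the entire blob.
doc = json.loads(row["profile"])
doc["city"] = "Bergen"
row["profile"] = json.dumps(doc)

# By contrast, with one column per field a single value can be set directly.
row_as_columns = {"name": "Ann", "city": "Oslo"}
row_as_columns["city"] = "Bergen"

print(json.loads(row["profile"]) == row_as_columns)
```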
