How does column-oriented NoSQL differ from document-oriented?


Question

The three types of NoSQL databases I've read about are key-value, column-oriented, and document-oriented.

Key-value is pretty straightforward - a key with a plain value.

I've seen document-oriented databases described as like key-value, but the value can be a structure, like a JSON object. Each "document" can have all, some, or none of the same keys as another.

Column-oriented seems to be very much like document-oriented in that you don't specify a structure.

So what is the difference between these two, and why would you use one over the other?

I've specifically looked at MongoDB and Cassandra. I basically need a dynamic structure that can change, but not affect other values. At the same time I need to be able to search/filter specific keys and run reports. With CAP, AP is the most important to me. The data can "eventually" be synced across nodes, just as long as there is no conflict or loss of data. Each user would get their own "table".

Answer

In Cassandra, each row (addressed by a key) contains one or more "columns". Columns are themselves key-value pairs. The column names need not be predefined, i.e. the structure isn't fixed. Columns in a row are stored in sorted order according to their keys (names).

In some cases, you may have very large numbers of columns in a row (e.g. to act as an index to enable particular kinds of query). Cassandra can handle such large structures efficiently, and you can retrieve specific ranges of columns.
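To illustrate why sorted column names make range retrieval cheap, here is a minimal sketch in plain Python. It is a toy in-memory model, not Cassandra's storage engine or client API; the Row class, its set and slice methods, and the date-named columns are all made up for the example.

```python
from bisect import bisect_left, bisect_right


class Row:
    """Toy model of one wide row: column names are kept in sorted order.

    Hypothetical illustration only, not how Cassandra is implemented.
    """

    def __init__(self):
        self._names = []   # column names, always sorted
        self._values = {}  # column name -> value

    def set(self, name, value):
        # Insert a new name at its sorted position; overwrite existing values.
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end, in order."""
        lo = bisect_left(self._names, start)
        hi = bisect_right(self._names, end)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


# One row used as an index: each column name is a date, each value a count.
row = Row()
for day, clicks in [("2011-03-02", 7), ("2011-01-15", 3),
                    ("2011-02-09", 5), ("2011-04-20", 1)]:
    row.set(day, clicks)

print(row.slice("2011-02-01", "2011-03-31"))
# [('2011-02-09', 5), ('2011-03-02', 7)]
```

Because the names are sorted, a range query only needs two binary searches and a contiguous read, which is the property that makes wide rows usable as indexes.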

There is a further level of structure (not so commonly used) called super-columns, where a column contains nested (sub)columns.

You can think of the overall structure as a nested hashtable/dictionary, with 2 or 3 levels of key.

Normal column family:

row
    col  col  col ...
    val  val  val ...

Super column family:

row
      supercol                      supercol                     ...
          (sub)col  (sub)col  ...       (sub)col  (sub)col  ...
           val       val      ...        val       val      ...
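The nested hashtable/dictionary view of the two layouts above can be written out literally. This is only a mental model in plain Python; the row keys, column names, and the get_column helper are invented for illustration.

```python
# Normal column family: row key -> column name -> value (2 levels of key).
users = {
    "alice": {"email": "alice@example.com", "city": "Oslo"},
    # Rows need not share the same columns.
    "bob": {"email": "bob@example.com", "nickname": "bobby"},
}

# Super column family: row key -> super column -> (sub)column -> value (3 levels).
addresses = {
    "alice": {
        "home": {"city": "Oslo", "zip": "0150"},
        "work": {"city": "Bergen"},
    },
}


def get_column(family, row_key, *path):
    """Walk the nested keys; return None when any level is missing."""
    node = family.get(row_key)
    for key in path:
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


print(get_column(users, "bob", "nickname"))            # bobby
print(get_column(addresses, "alice", "work", "city"))  # Bergen
print(get_column(users, "alice", "nickname"))          # None
```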

There are also higher-level structures - column families and keyspaces - which can be used to divide up or group together your data.

See also this Question: Cassandra: What is a subcolumn

Or the data modelling links from http://wiki.apache.org/cassandra/ArticlesAndPresentations

Re: comparison with document-oriented databases - the latter usually insert whole documents (typically JSON), whereas in Cassandra you can address individual columns or supercolumns, and update these individually, i.e. they work at a different level of granularity. Each column has its own separate timestamp/version (used to reconcile updates across the distributed cluster).
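As a rough sketch of what per-column timestamps buy you, the following plain Python merges two replicas of the same row column by column, keeping the newer write for each. It is a simplification under stated assumptions: the merge_rows function and the (value, timestamp) tuples are hypothetical, and real clusters also deal with deletes and timestamp ties, which are left out here.

```python
def merge_rows(replica_a, replica_b):
    """Merge two copies of a row column by column; the newer timestamp wins."""
    merged = dict(replica_a)
    for name, (value, ts) in replica_b.items():
        if name not in merged or ts > merged[name][1]:
            merged[name] = (value, ts)
    return merged


# Each column is stored as (value, timestamp).
node1 = {"email": ("old@example.com", 100), "city": ("Oslo", 250)}
node2 = {"email": ("new@example.com", 300), "city": ("Bergen", 200),
         "phone": ("555-0100", 150)}

merged = merge_rows(node1, node2)
print(merged)
# {'email': ('new@example.com', 300), 'city': ('Oslo', 250), 'phone': ('555-0100', 150)}
```

Note that the result mixes columns from both replicas. If the whole row were written and versioned as a single document, one replica's copy would have to win outright and the other's newer fields would be lost.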

The Cassandra column values are just bytes, but can be typed as ASCII, UTF8 text, numbers, dates etc.

Of course, you could use Cassandra as a primitive document store by inserting columns containing JSON - but you wouldn't get all the features of a real document-oriented store.
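A small sketch of what "primitive" means in practice, again as a plain Python model rather than real Cassandra client code (the rows dictionary, the "doc" column name, and find_by_field are assumptions made for the example): the store sees the JSON only as an opaque value, so filtering on a field inside the document has to happen in the application.

```python
import json

# One column per row holding the whole document as a JSON string.
# The store treats the value as opaque bytes/text.
rows = {
    "user:1": {"doc": json.dumps({"name": "Ann", "plan": "pro"})},
    "user:2": {"doc": json.dumps({"name": "Bo", "plan": "free", "tags": ["beta"]})},
}


def find_by_field(rows, field, wanted):
    """Client-side filter: every document must be fetched and parsed."""
    hits = []
    for key, columns in rows.items():
        doc = json.loads(columns["doc"])
        if doc.get(field) == wanted:
            hits.append(key)
    return hits


print(find_by_field(rows, "plan", "pro"))  # ['user:1']
```

A document-oriented store such as MongoDB can index and query fields inside the document on the server, which is the kind of feature you give up with this approach.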
