Data warehousing principles and NoSQL


Question

With MongoDB, CouchDB and related technologies we can get faster querying, so is this definition still valid?

"A copy of transaction data, specially restructured for queries and analyses." (R. Kimball, The Data Warehouse Toolkit, 1996)

I mean, do we really need to restructure our data into an OLAP schema in order to query it for analysis purposes? More specifically, can drill-down, slice and dice, and other analytical reporting be achieved with NoSQL (NOT necessarily with OLAP modelling)? Also, could we overcome the "data subset" querying limitation of OLAP and report on the whole data universe with NoSQL?

Recommended answer

In my estimation, OLAP subsets or structures will not go away, and may become more common, for a few reasons. In no particular order:

f) Map-reduce is all you get in many cases. MongoDB is on a steadier footing with its speedier aggregation pipeline.
u) A big gotcha with NoSQL is the lack of joins or relationships, meaning that your underlying data has to be ugly in order to support many OLAP reports.
b) It's worthwhile constructing 'throw-away' or volatile data subsets simply to keep a clean master table/collection.
a) NoSQL is perfectly suited to redundant datasets: no CREATE TABLE statements or even schemas are needed, and it's dead simple to spin up and kill collections.
r) NoSQL is heaps easier than SQL to scale for the additional datasets.
d) A fledgling start-up can avoid the cost and resources needed to support two database technologies (one for OLAP and one for OLTP).
b) You'll find your backend/frontend code much easier to write and manage with massaged datasets.
c) Premade datasets with their own premade indices have an unbeatable speed advantage.
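To make the idea of a premade, throw-away data subset a little more concrete, here is a minimal sketch in plain Python rather than in any particular NoSQL product. Everything in it is an assumption made for illustration: the `orders` documents, their field names, and the helpers `build_summary` and `slice_summary` are hypothetical. The region is embedded in every order because there is no join to fetch it, which is one reading of the "ugly" underlying data mentioned above. Each summary is a small dataset keyed by its dimension values, so a report becomes a lookup rather than a scan of the master collection.

```python
from collections import defaultdict

# Hypothetical denormalized order documents. The region and country are
# embedded in every order because the store offers no joins.
orders = [
    {"order_id": 1, "region": "EU", "country": "DE", "product": "widget", "amount": 120.0},
    {"order_id": 2, "region": "EU", "country": "FR", "product": "widget", "amount": 80.0},
    {"order_id": 3, "region": "EU", "country": "DE", "product": "gadget", "amount": 200.0},
    {"order_id": 4, "region": "US", "country": "US", "product": "widget", "amount": 50.0},
]


def build_summary(docs, dimensions, measure="amount"):
    """Pre-aggregate docs into a dict keyed by a tuple of dimension values."""
    summary = defaultdict(float)
    for doc in docs:
        key = tuple(doc[d] for d in dimensions)
        summary[key] += doc[measure]
    return dict(summary)


def slice_summary(summary, dimensions, **fixed):
    """Keep only the cells whose dimension values match the fixed ones."""
    positions = {dimensions.index(name): value for name, value in fixed.items()}
    return {
        key: total
        for key, total in summary.items()
        if all(key[i] == value for i, value in positions.items())
    }


# Coarse summary: one cell per region.
by_region = build_summary(orders, ["region"])

# Drill-down: add a finer dimension to the grouping key.
by_region_country = build_summary(orders, ["region", "country"])

# Slice: fix one dimension to a single value.
eu_only = slice_summary(by_region_country, ["region", "country"], region="EU")

print(by_region)          # {('EU',): 400.0, ('US',): 50.0}
print(by_region_country)  # {('EU', 'DE'): 320.0, ('EU', 'FR'): 80.0, ('US', 'US'): 50.0}
print(eu_only)            # {('EU', 'DE'): 320.0, ('EU', 'FR'): 80.0}
```

In a real document store, each summary would more likely be written to its own collection and rebuilt or dropped as needed, leaving the master collection untouched.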
