Data warehousing principles and NoSQL

Question

With MongoDB, CouchDB, and related technologies we can get faster querying, so is the following still valid?

"A copy of transaction data, specially restructured for queries and analyses." (R. Kimball, The Data Warehouse Toolkit, 1996)

I mean, do we really need to restructure our data into an OLAP schema in order to query it for analysis? More specifically, can drill-down, slice-and-dice, and other analytical reporting be achieved with NoSQL (NOT necessarily with OLAP modelling)? Also, could we overcome the "data subset" querying limitation of OLAP and report on the whole data universe with NoSQL?

Recommended answer

In my estimation, OLAP subsets or structures will not go away and may become more common, for a few reasons. In no particular order:
a) Map-reduce is all you get in many cases. MongoDB is on a steadier footing with its speedier aggregation pipeline.
b) A big gotcha with NoSQL is the lack of joins or relationships, meaning that your underlying data has to be ugly in order to support many OLAP reports.
c) It's worthwhile constructing "throwaway" or volatile data subsets simply to keep a clean master table/collection.
d) NoSQL is perfectly suited to redundant datasets: no CREATE TABLE statements or even schemas are needed, and it's dead simple to spin up and kill collections.
e) NoSQL is much easier than SQL to scale for the additional dataset.
f) A fledgling start-up can avoid the cost and resources needed to support two database technologies (one for OLAP and one for OLTP).
g) You'll find your backend/frontend code much easier to write and more manageable with massaged datasets.
h) Premade datasets with their own premade indices have an unbeatable speed advantage.
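To make the "throwaway subset" idea concrete, here is a minimal sketch in plain Python, with no database involved. Everything in it is an assumption made for illustration: the `orders` documents, their field names, and the helpers `build_rollup` and `slice_rollup` are hypothetical and are not part of MongoDB, CouchDB, or any other library. It pre-aggregates denormalized documents into a small summary keyed by dimension values, then slices that summary instead of rescanning the master data.

```python
from collections import defaultdict

# Hypothetical denormalized "order" documents, shaped the way a document
# store might hold them. The fields and values are made up for this example.
orders = [
    {"region": "EU", "product": "widget", "year": 2023, "amount": 120.0},
    {"region": "EU", "product": "gadget", "year": 2023, "amount": 80.0},
    {"region": "US", "product": "widget", "year": 2023, "amount": 200.0},
    {"region": "US", "product": "widget", "year": 2024, "amount": 50.0},
]


def build_rollup(docs, dimensions, measure):
    """Sum `measure` per combination of `dimensions` (a throwaway summary)."""
    rollup = defaultdict(float)
    for doc in docs:
        key = tuple(doc[name] for name in dimensions)
        rollup[key] += doc[measure]
    return dict(rollup)


def slice_rollup(rollup, dimensions, **fixed):
    """Keep only the cells whose dimension values match the `fixed` values."""
    positions = {dimensions.index(name): value for name, value in fixed.items()}
    return {
        key: total
        for key, total in rollup.items()
        if all(key[i] == value for i, value in positions.items())
    }


DIMENSIONS = ("region", "product", "year")

# Build the premade dataset once, then answer reports from it.
sales_cube = build_rollup(orders, DIMENSIONS, "amount")

# Slice: fix one dimension and keep the remaining detail.
eu_sales = slice_rollup(sales_cube, DIMENSIONS, region="EU")

# Roll up to a coarser grain, the reverse of a drill-down.
by_region = build_rollup(orders, ("region",), "amount")
```

In a real document store the summary would probably be written to its own collection and indexed on the dimension fields, which is where the speed advantage in point h) would come from. The sketch only shows the shape of the data flow.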
