What is the actual difference between Data Warehouse & Big Data?

Question

I know what a data warehouse is and what big data is, but I am confused about data warehouse vs. big data. Are they the same thing under different names, or are they different (conceptually and physically)?

Recommended answer

I know that this is an older thread, but there have been some developments in the last year or so. Comparing the data warehouse to Hadoop is like comparing apples to oranges. The data warehouse is a concept: clean, integrated data of high quality. I don't think the need for a data warehouse will go away anytime soon. Hadoop, on the other hand, is a technology: a distributed compute framework for processing large volumes of data. In the past, data warehouses were typically built on relational databases and data warehouse appliances. Over the last couple of years, however, various limitations of the RDBMS have emerged: exploding license costs in the face of growing data volumes, and a poor fit for querying graphs and hierarchies or for ingesting unstructured data types, among others. At the same time, MPP SQL query engines on Hadoop, such as Apache Drill, have appeared and now make it possible to query data that sits on Hadoop.
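To make the concept-versus-technology point a bit more concrete, here is a minimal Python sketch of what "clean, integrated data" can mean. Everything in it is an assumption for illustration: the two source systems, the table names `dim_customer` and `fact_order`, and the sample records are made up, and SQLite merely stands in for whatever storage and query layer is actually used.

```python
import sqlite3

# Hypothetical raw records from two source systems (made-up names and values).
crm_rows = [
    ("C-001", " Alice Smith ", "alice@example.com"),
    ("C-002", "Bob Jones", "BOB@EXAMPLE.COM"),
]
shop_rows = [
    ("bob@example.com", 120.0),
    ("alice@example.com", 80.5),
    ("bob@example.com", 30.0),
]


def build_warehouse(conn):
    """Load both sources into conformed tables, cleaning values on the way in."""
    conn.execute(
        "CREATE TABLE dim_customer (customer_id TEXT PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("CREATE TABLE fact_order (email TEXT, amount REAL)")
    # Cleaning: trim whitespace and normalise e-mail case so the sources can be joined.
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?, ?)",
        [(cid, name.strip(), email.strip().lower()) for cid, name, email in crm_rows],
    )
    conn.executemany(
        "INSERT INTO fact_order VALUES (?, ?)",
        [(email.strip().lower(), amount) for email, amount in shop_rows],
    )


def revenue_per_customer(conn):
    """Integration: one query answers a question that spans both sources."""
    return conn.execute(
        """
        SELECT c.name, SUM(f.amount)
        FROM dim_customer AS c
        JOIN fact_order AS f ON f.email = c.email
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
result = revenue_per_customer(conn)
print(result)
```

The cleaning and integration steps are the "warehouse" part. Where the tables physically live, and which engine runs the SQL, is the technology part and could in principle be swapped out without changing the idea.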

I have written a whole series of posts on the subject if you are interested in all of the details: "Data Warehousing in the age of big data. The end of an era?"
