关系数据库与维数据库有什么区别? [英] Relational vs. Dimensional Databases, what's the difference?

查看:261
本文介绍了关系数据库与维数据库有什么区别?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想了解OLAP和数据仓库,我对关系和维度建模之间的区别感到困惑。

I'm trying to learn about OLAP and data warehousing, and I'm confused about the difference between relational and dimensional modeling. Is dimensional modeling basically relational modeling, but allowing for redundant/un-normalized data?

例如,假设我有(产品,城市,#销售)的历史销售数据)。我理解以下是一个关系的观点:

For example, let's say I have historical sales data on (product, city, # sales). I understand that the following would be a relational point-of-view:


Product | City | # Sales
Apples, San Francisco, 400
Apples, Boston, 700
Apples, Seattle, 600
Oranges, San Francisco, 550
Oranges, Boston, 500
Oranges, Seattle, 600

虽然以下是更具维度的观点:

While the following is a more dimensional point-of-view:


Product | San Francisco | Boston | Seattle
Apples, 400, 700, 600
Oranges, 550, 500, 600

但是仍然以相同的星形模式实现观点:

But it seems like both points of view would nonetheless be implemented in an identical star schema:


Fact table: Product ID, Region ID, # Sales
Product dimension: Product ID, Product Name
City dimension: City ID, City Name

每个维度的额外详细信息,差异开始弹出。例如,如果你想跟踪区域,关系数据库将倾向于有一个单独的区域表,以保持一切正常化:

And it's not until you start adding some additional details to each dimension that the differences start popping up. For instance, if you wanted to track regions as well, a relational database would tend to have a separate region table, in order to keep everything normalized:


City dimension: City ID, City Name, Region ID
Region dimension: Region ID, Region Name, Region Manager, # Regional Stores

虽然维数据库允许非标准化将区域数据保存在城市维度内,以便更轻松地对数据进行切片:

While a dimensional database would allow for denormalization to keep the region data inside the city dimension, in order to make it easier to slice the data:


City dimension: City ID, City Name, Region Name, Region Manager, # Regional Stores

这是否正确?

推荐答案

星型模式真正在于数据的关系模型和数据的维度模型的交集。它实际上是一个从维模型开始,并将其映射到SQL表,有点类似于从一个关系模型开始时获得的SQL表。

A star schema really lies at the intersection of the relational model of data and the dimensional model of data. It's really a way of starting with a dimensional model, and mapping it into SQL tables that somewhat resemble the SQL tables you get if you start from a relational model.

我说有点类似,因为许多关系设计方法导致归一化设计或至少近似归一化设计。星形模式将与完全规范化有显着的偏离。

I say somewhat resemble because many relational design methodologies result in a normalized design, or at least a nearly normalized design. A star schema will have significant departures from full normalization.

每次离开完全正常化都会导致数据更新异常。 (我包括在一个伞下插入,更新和删除操作的异常)。这些异常与您开始使用的数据模型没有任何关系。

Every departure from full normalization carries with it a consequent data update anomaly. (I'm including anomlaies on insert, update and delete operations under one umbrella). Those anomalies don't have anything to do with what data model you started with.

OLTP与OLAP的注释在这里是相关的。更新异常对这两种情况下的性能和/或编程难度有不同的影响。

The comment on OLTP versus OLAP is relevant here. Update anomalies will have different impacts on performance and/or programming difficulty in those two situations.

除了SQL数据库中的星型模式,还有维度数据库产品其中存储以该产品唯一的物理形式的数据。使用这些产品,您看不到星形模式,因为您看到了维度模型的直接实现,以及可能是产品特有的界面。其中一些接口允许OLAP操作完全点击。

In addition to a star schema in an SQL databaase, there are dimensional database products out there that store data in a physical form that is unique to that product. With those products, you don't see a star schema so much as you see a direct implementation of the dimensional model, and an interface that might be peculiar to the product. Some of those interfaces allow OLAP operations to be completely point-and-click.

作为一个离题的问题,我曾经建立一个星型模式作为一个中间步骤支持基于事务的应用程序的OLTP数据库和Cognos PowerPlay中的数据管理器。使用标准ETL技术,从OLTP数据库到星型模式,然后从星型模式到数据多维数据集的组合传输实际上优于从OLTP数据库到数据立方体的直接传输。这是一个意想不到的结果。

Just as a digression from your question, I once built a star schema as an intermediate step between an OLTP database that supported a transaction based application and a datacube inside Cognos PowerPlay. Using standard ETL techniques, the combined transfer from the OLTP database to the star schema and then from the star schema to the data cube actually outperformed the direct transfer from the OLTP database to the datacube. This was an unexpected result.

希望这有助于。

这篇关于关系数据库与维数据库有什么区别?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆