Amazon Redshift Foreign Keys - Sort or Interleaved Keys

Question

We plan to import OLTP Relational tables into AWS Redshift. The CustomerTransaction table joins to multiple lookup tables. I only included 3, but we have more.

What should the sort key be on the CustomerTransaction table? In regular SQL Server, we have nonclustered indexes on the foreign keys in the CustomerTransaction table. For AWS Redshift, should I use a compound sort key or an interleaved sort key on the foreign key columns in CustomerTransaction? What is the best indexing strategy for this table design? Thanks,

create table dbo.CustomerTransaction
(
    CustomerTransactionId bigint primary key identity(1,1),
    ProductTypeId bigint,   -- foreign key to the ProductType table
    StatusTypeId bigint,    -- foreign key to the StatusType table
    DateOfPurchase date,
    PurchaseAmount float
    -- .... (further columns elided in the original)
)

create table dbo.ProductType
(
    ProductTypeId bigint primary key identity(1,1),   -- referenced by CustomerTransaction.ProductTypeId
    ProductName varchar(255),
    ProductDescription varchar(255)
    -- .... (further columns elided in the original)
)

create table dbo.StatusType
(
    StatusTypeId bigint primary key identity(1,1),   -- referenced by CustomerTransaction.StatusTypeId
    StatusTypeName varchar(255),
    StatusDescription varchar(255)
    -- .... (further columns elided in the original)
)

Recommended answer

The general rules of thumb are:

  • Set the DISTKEY based on what you commonly GROUP BY
  • Set the SORTKEY based on what you commonly use in WHERE clauses
  • Avoid interleaved sort keys (they are only optimal in rare circumstances and require frequent VACUUM)
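As a rough illustration of these rules, here is a minimal sketch of how CustomerTransaction might be declared in Redshift. It assumes, purely for the sake of example, that most queries filter on DateOfPurchase (and sometimes StatusTypeId) in WHERE and group or join on ProductTypeId. The question does not describe the actual query patterns, so treat the key choices as placeholders rather than a recommendation.

```sql
-- Hypothetical Redshift DDL; the DISTKEY and SORTKEY choices are assumptions
-- about the workload, not something stated in the question.
CREATE TABLE CustomerTransaction
(
    CustomerTransactionId BIGINT IDENTITY(1,1),
    ProductTypeId         BIGINT,   -- joins to ProductType
    StatusTypeId          BIGINT,   -- joins to StatusType
    DateOfPurchase        DATE,
    PurchaseAmount        FLOAT,
    PRIMARY KEY (CustomerTransactionId)   -- informational only; Redshift does not enforce it
)
DISTKEY (ProductTypeId)                          -- assumed: commonly used in GROUP BY / joins
COMPOUND SORTKEY (DateOfPurchase, StatusTypeId); -- assumed: commonly used in WHERE filters
```

If ProductTypeId has only a handful of distinct values, distributing on it would skew rows across slices, so a higher-cardinality column (or EVEN distribution) may be the better fit. A compound sort key helps most when filters use its leading column, which is why the assumed date filter comes first.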

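From Choose the Best Distribution Style - Amazon Redshift:

  • Distribute the fact table and one dimension table on their common columns
  • Choose the largest dimension based on the size of the filtered data set
  • Choose a column with high cardinality in the filtered result set
  • Change some dimension tables to use ALL distribution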
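Applied to the lookup tables in the question, that guidance could look something like the sketch below. It assumes that CustomerTransaction is distributed on ProductTypeId and that StatusType is small enough to copy to every node. Both are assumptions made for illustration, since the table sizes and query patterns are not given.

```sql
-- Hypothetical Redshift DDL for the lookup (dimension) tables.

-- Distributed on the join column, so that matching rows sit on the same slice
-- as a fact table that is also distributed on ProductTypeId (assumed).
CREATE TABLE ProductType
(
    ProductTypeId      BIGINT IDENTITY(1,1),
    ProductName        VARCHAR(255),
    ProductDescription VARCHAR(255),
    PRIMARY KEY (ProductTypeId)
)
DISTKEY (ProductTypeId)
SORTKEY (ProductTypeId);

-- Small lookup table: ALL distribution places a full copy on every node,
-- so joins to it need no data redistribution.
CREATE TABLE StatusType
(
    StatusTypeId      BIGINT IDENTITY(1,1),
    StatusTypeName    VARCHAR(255),
    StatusDescription VARCHAR(255),
    PRIMARY KEY (StatusTypeId)
)
DISTSTYLE ALL;
```

ALL distribution multiplies storage and slows loads and updates for that table, so it tends to suit small, slowly changing lookup tables rather than large ones.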

So, it is not easy to recommend a particular DISTKEY and SORTKEY, because the right choice depends on how you use the tables. Merely seeing the DDL is not sufficient to recommend the best way to optimize them.

Other references:

  • Amazon Redshift Best Practices for Designing Tables
  • Top 10 Performance Tuning Techniques for Amazon Redshift | AWS Big Data Blog
