Choosing SQL Server data types for maximum speed


Question

I'm designing a database that will need to be optimized for maximum speed.

All the database data is generated once from something I call an input database (which holds the data I'm editing, mainly some polylines, markers, etc. for Google Maps).

So the database is not subject to editing, but it needs to hold as much data as it can for quickly displaying results to the user (routes across town, custom polylines, etc.).

The question is: will choosing smaller data types, for example smallint over int, improve performance or hurt it? Space is not really a problem: after some quick calculations, the database will not exceed 200 MB, and no table will have more than 100,000 rows (the average will be around 5,000).

I'm asking this because I read some articles around the internet: some say that smaller data types improve performance, others say they hurt it because additional processing must be done. I'm aware that for smaller databases the results are probably not noticeable, but I'm interested in every bit because I'm expecting many requests, which will trigger a lot more queries.

The hosting environment is gonna be Windows Server 2008 R2 with SQL Server 2008 R2.

EDIT 1: Just to give you an example because I don't have a proper table structure yet: I'm going to have a table which will hold public transportation lines (somewhere around 200), identified by a unique number in real life, and which is going to be referenced in all sorts of tables and on which all sorts of operations are going to be made. These referencing tables will hold the largest amount of data.

Because lines have unique numbers, I have thought of 3 examples of designs:


  1. The PK is the line number, of data type smallint.
  2. The PK is the line number, of data type int.
  3. The PK is something different (an identity, for example) and the line number is stored in a different field.

Just for the sake of argument: in the 'input database', which is not subject to optimization, the PK is a GUID (16 bytes). If you like, you can compare how bad this is relative to the others, if it really is.

So keep in mind that the PK is going to be referenced in at least 15 tables, some of which will have over 50,000 rows (the rest averaging 5,000, as I said above). These tables are going to be subject to constant querying and manipulation, and I'm interested in every bit of speed that I can get.

I can detail this even more if you need. Thanks

EDIT 2: And another question related to this came to my mind, think it fits into this discussion:

Will I see any performance improvements in this specific scenario if I use native SQL queries from inside my .NET application rather than using LINQ to SQL? I know LINQ is strongly optimized and generates very good queries performance-wise, but still, sure worth asking. Thanks again.

Recommended answer

Can you point to some articles that say that smaller data types = more processing? Keep in mind that even with SSDs, most workloads today are I/O-bound (or memory-bound), not CPU-bound.

Particularly in cases where the PK is going to be referenced in many tables, it will be beneficial to use the smallest data type possible. In this case, if that's a SMALLINT then that's what I would use (though you say there are about 200 values, so theoretically you could use TINYINT, which is half the size and supports 0-255). Where you need to exercise caution is if you aren't 100% sure that there will always be ~200 values. Once you need 256, you're going to have to change the data type in all of the affected tables, and this is going to be a pain. So sometimes a trade-off is made between accommodating future growth and squeezing out the absolute most performance today. If you don't know for certain that you will never exceed 255 or 32,000 values, then I would probably just use an INT. Unless you also don't know that you won't ever exceed 2 billion values, in which case you would use BIGINT.
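As a rough illustration of the range reasoning above, here is a minimal Python sketch that picks the smallest SQL Server integer type able to hold a given maximum key value. The type ranges and byte sizes are the documented ones; the function name `smallest_integer_type` and the `growth_factor` parameter are hypothetical, added only to model "leave room for future growth".

```python
# Documented ranges and storage sizes of SQL Server's integer types:
# (name, minimum value, maximum value, bytes per value).
INTEGER_TYPES = [
    ("TINYINT", 0, 255, 1),
    ("SMALLINT", -32768, 32767, 2),
    ("INT", -2147483648, 2147483647, 4),
    ("BIGINT", -9223372036854775808, 9223372036854775807, 8),
]


def smallest_integer_type(max_value, growth_factor=1):
    """Return (type_name, bytes) for the smallest type holding the key.

    max_value is the largest non-negative key expected today;
    growth_factor (a hypothetical knob, integer) multiplies it to
    leave headroom for future growth.
    """
    if max_value < 0 or growth_factor < 1:
        raise ValueError("max_value must be >= 0 and growth_factor >= 1")
    target = max_value * growth_factor
    for name, _low, high, size in INTEGER_TYPES:
        if target <= high:
            return name, size
    raise ValueError("value does not fit in any integer type")


# About 200 transport lines fit in TINYINT, but one more than 255
# already forces a wider type in every referencing table.
print(smallest_integer_type(200))
print(smallest_integer_type(256))
print(smallest_integer_type(200, growth_factor=10))
```

The `growth_factor` argument is just one way to express the trade-off: if you are not sure the count stays near 200, planning for ten times as many already rules out TINYINT.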

The difference between INT/SMALLINT/TINYINT is going to be more noticeable in disk space than in performance. (And if you're on Enterprise, the differences in both disk space and performance can be offset quite a bit using data compression - particularly while your INT values all fit within SMALLINT/TINYINT, though in the latter case it really will be negligible because the values are unique.) On the other hand, the difference between any of these and GUID is going to be much more noticeable in both performance and disk space. Marc gave some great links from Kimberly; I wrote this article in 2003 and while it's a little dated it does contain most of the salient points that are still relevant today.
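To put rough numbers on the disk-space side, here is a back-of-the-envelope Python sketch. The per-value byte sizes are the documented ones, but the workload is an assumption loosely based on the question (15 referencing tables, taken here as 3 with 50,000 rows and 12 with 5,000), and the estimate counts only the raw key column, ignoring row overhead, indexes, and compression.

```python
# Documented storage size, in bytes, of one key value per data type.
KEY_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "INT": 4,
    "BIGINT": 8,
    "UNIQUEIDENTIFIER": 16,  # GUID
}


def key_column_bytes(type_name, row_counts):
    """Raw bytes used by one key column across all referencing tables."""
    return KEY_BYTES[type_name] * sum(row_counts)


# Assumed workload: 3 tables of 50,000 rows and 12 tables of 5,000 rows.
row_counts = [50_000] * 3 + [5_000] * 12

for type_name in ("SMALLINT", "INT", "UNIQUEIDENTIFIER"):
    total = key_column_bytes(type_name, row_counts)
    print(f"{type_name}: {total:,} bytes")
```

Under these assumptions the key column takes about 0.4 MB as SMALLINT, 0.8 MB as INT, and 3.4 MB as a GUID. The real gap is larger wherever the key is also carried in indexes, but at this database size the integer types are all close to one another, while the GUID stands out.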

Another trade-off that sometimes needs to be considered (though not in your specific case, it seems) is whether values need to be unique across multiple systems. This is where you might need to sacrifice some performance in order to meet business requirements. In a lot of cases folks take the easy way and resign themselves to GUID. But there are other solutions too, such as identity ranges, a central custom sequence generator, and the new SEQUENCE object in SQL Server 2012. I wrote about SEQUENCE back in 2010 when the first public beta of SQL Server 2012 was released.
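For the identity-range idea, a minimal Python sketch of the bookkeeping might look like this: each system gets its own contiguous, non-overlapping block of integer key values, so keys stay unique across systems without resorting to GUIDs. The function name, the system names, and the block size are all hypothetical; in SQL Server the lower bound of each block would typically become that system's identity seed, which is not shown here.

```python
def identity_ranges(system_names, range_size, start=1):
    """Assign each system a contiguous, non-overlapping block of keys.

    Returns a dict mapping system name to an inclusive (low, high) pair.
    """
    if range_size < 1:
        raise ValueError("range_size must be at least 1")
    ranges = {}
    for index, name in enumerate(system_names):
        low = start + index * range_size
        ranges[name] = (low, low + range_size - 1)
    return ranges


# Hypothetical example: two systems, one million keys each.
for name, (low, high) in identity_ranges(["site_a", "site_b"], 1_000_000).items():
    print(f"{name}: keys {low} to {high}")
```

The catch with this approach is that the block size must be chosen up front; a system that exhausts its block needs a new one assigned, which is part of why a central sequence generator is sometimes preferred.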
