What is the best way to store historical data in SQL Server 2005/2008?


Question

My simplified and contrived example is the following:

Let's say that I want to measure and store the temperature (and other values) of all the world's towns on a daily basis. I am looking for an optimal way of storing the data so that it is just as easy to get the current temperature in all the towns as it is to get the full temperature history of one town.

It is an easy enough problem to solve, but I am looking for the best solution.

The two main options I can think of are as follows:

Option 1: Store all the current and archive records in the same table.

CREATE TABLE [dbo].[WeatherMeasurement](
  MeasurementID [int] IDENTITY(1,1) NOT NULL,  -- surrogate key
  TownID [int] NOT NULL,                       -- town being measured
  Temp [int] NOT NULL,                         -- measured temperature
  Date [datetime] NOT NULL                     -- when the measurement was taken
)

This would keep everything simple, but what would be the most efficient query to get a list of towns and their current temperatures? Would this scale once the table has millions of rows? Is there anything to be gained by having some sort of IsCurrent flag in the table?
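
For reference, the two access patterns against the single table could be sketched roughly as follows. This is only an illustration, not a benchmarked recommendation: the index name and the TownID value 42 are made up, and the sketch assumes one row per town per measurement, as in the table above.

```sql
-- Sketch only: the index name is hypothetical and not part of the question.
-- A composite index on (TownID, Date) supports both query patterns below.
CREATE NONCLUSTERED INDEX IX_WeatherMeasurement_Town_Date
    ON dbo.WeatherMeasurement (TownID, [Date] DESC)
    INCLUDE (Temp);

-- Current temperature in every town: the most recent row per TownID.
WITH Ranked AS (
    SELECT TownID, Temp, [Date],
           ROW_NUMBER() OVER (PARTITION BY TownID
                              ORDER BY [Date] DESC) AS rn
    FROM dbo.WeatherMeasurement
)
SELECT TownID, Temp, [Date]
FROM Ranked
WHERE rn = 1;

-- Full history for one town (42 is a placeholder TownID).
SELECT [Date], Temp
FROM dbo.WeatherMeasurement
WHERE TownID = 42
ORDER BY [Date];
```

Both `ROW_NUMBER()` and `INCLUDE` columns are available from SQL Server 2005 onwards, so the sketch should apply to either version named in the title.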

Option 2: Store archive records in a separate table. There would be a table to store the current live measurements:

CREATE TABLE [dbo].[WeatherMeasurement](
  MeasurementID [int] IDENTITY(1,1) NOT NULL,  -- surrogate key
  TownID [int] NOT NULL,                       -- town being measured
  Temp [int] NOT NULL,                         -- current temperature
  Date [datetime] NOT NULL                     -- when the measurement was taken
)

And a table to store the historical archived data (inserted by a trigger, perhaps):

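CREATE TABLE [dbo].[WeatherMeasurementHistory](
  MeasurementID [int] IDENTITY(1,1) NOT NULL,  -- surrogate key of the history row
  TownID [int] NOT NULL,                       -- town being measured
  Temp [int] NOT NULL,                         -- archived temperature
  Date [datetime] NOT NULL                     -- when the measurement was taken
)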

This has the advantage of keeping the main current data lean and very efficient to query, at the expense of making the schema more complex and inserts more expensive.
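
The archiving trigger mentioned for this option might look something like the sketch below. The trigger name is hypothetical, and the sketch assumes the live table holds one row per town that is updated in place each day, which the question does not state explicitly.

```sql
-- Sketch only: the trigger name is hypothetical, and it assumes the live
-- table holds one row per town that is updated in place each day.
CREATE TRIGGER dbo.trg_WeatherMeasurement_Archive
ON dbo.WeatherMeasurement
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- "deleted" holds the pre-update rows; copy them to the history table.
    -- MeasurementID is omitted because the history table generates its own.
    INSERT INTO dbo.WeatherMeasurementHistory (TownID, Temp, [Date])
    SELECT d.TownID, d.Temp, d.[Date]
    FROM deleted AS d;
END;
```

Every daily update then also writes a history row, which is the extra insert cost described above.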

Which is the best option? Are there better options I haven't mentioned?

NOTE: I have simplified the schema to help focus my question better, but assume there will be a lot of data inserted each day (100,000s of records), and data is current for one day. The current data is just as likely to be queried as the historical data.

Recommended answer

It depends on the application's usage patterns. If usage patterns indicate that the historical data will be queried more often than the current values, then put them all in one table. But if historical queries are the exception (or less than 10% of the queries), and the performance of the more common current-value query will suffer from putting all the data in one table, then it makes sense to separate the historical data into its own table.
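
If the data is split, one way to keep the occasional historical query simple is a view over both tables. This is a sketch under assumptions rather than part of the answer: the view name is hypothetical, and the table names are those from the question's second option.

```sql
-- Sketch only: the view name is hypothetical.
-- Presents current and archived measurements as a single rowset.
CREATE VIEW dbo.WeatherMeasurementAll
AS
SELECT TownID, Temp, [Date] FROM dbo.WeatherMeasurement
UNION ALL
SELECT TownID, Temp, [Date] FROM dbo.WeatherMeasurementHistory;
```

Current-value queries would still read only the small live table, while historical queries select from the view.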
