How to delete every record except one per hour
Problem description
I have a MySQL table with millions of sensor records with the following structure:
datanumber (auto increment),
stationid (int),
sensortype (int),
measuredate (datetime),
data (medtext)
Each station adds a record every 2-10 minutes per sensor type (2-5 sensors).
I would like to keep only one record per hour, per sensor, per station, and to thin out only records whose measuredate is older than 1 year.
I understand how to select data older than one year, but I have no clue how to delete all rows except one for each hour. It does not really matter whether the row kept for each hour is the first, the last, or a random one. I also do not need to calculate averages or anything else; I just want to reduce the number of records stored.
Recommended answer
This would be a lead-pipe cinch if we could use row_number() over( ... ) (window functions are only available from MySQL 8.0 onward), but a solution for older MySQL versions is not difficult. For problems like this, check whether we can write a query that returns just the rows we want to delete. That sounds easy enough. First, we want a list of each hour of each day together with the first (least) entry for that hour:
select Date( MeasureDate ) TheDate, Hour( MeasureDate ) TheHour, Min( MeasureDate ) MinTime
from T
group by TheDate, TheHour;
So we just have to join the table back to this result set:
select T.*
from T
join(
select Date( MeasureDate ) TheDate, Hour( MeasureDate ) TheHour, Min( MeasureDate ) MinTime
from T
group by TheDate, TheHour
) as T1
on T1.MinTime = T.MeasureDate;
This gives us all the rows we want to keep. So use a left join to invert the results:
select T.*
from T
left join(
select Date( MeasureDate ) TheDate, Hour( MeasureDate ) TheHour, Min( MeasureDate ) MinTime
from T
group by TheDate, TheHour
) as T1
on T1.MinTime = T.MeasureDate
where T1.MinTime is null;
Change the select to a delete and voilà:
delete TDel
from T TDel
left join(
select Date( MeasureDate ) TheDate, Hour( MeasureDate ) TheHour, Min( MeasureDate ) MinTime
from T
group by TheDate, TheHour
) as T1
on T1.MinTime = TDel.MeasureDate
where T1.MinTime is null;
You can add other fields such as SensorType as appropriate to keep the first entry of each hour per sensor, or however you want to tune it.
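As a rough illustration of that tuning, the left-join delete might be extended to cover the whole requirement from the question: one row per station, per sensor, per hour, and only for data older than one year. This is a sketch under assumptions, not a tested migration script: T is the placeholder table name used in the queries above, the column names come from the question, and keeping the lowest DataNumber in each group is an arbitrary choice (the question says any row will do). Joining on the auto-increment key rather than on MeasureDate avoids keeping several rows when two records share the same timestamp.

```sql
-- Sketch only: T stands in for the real table name.
-- Try the equivalent select first, and run the delete on a backup
-- or inside a transaction.
delete TDel
from T TDel
left join(
    -- One surviving row (lowest DataNumber) per station, sensor, day and hour,
    -- computed only over rows older than one year.
    select Min( DataNumber ) KeepId
    from T
    where MeasureDate < now() - interval 1 year
    group by StationId, SensorType, Date( MeasureDate ), Hour( MeasureDate )
) as T1
    on T1.KeepId = TDel.DataNumber
where TDel.MeasureDate < now() - interval 1 year  -- never touch recent rows
  and T1.KeepId is null;                          -- row is not the chosen survivor
```

With millions of rows, a single statement like this may hold locks for a long time, so it could be worth adding a MeasureDate range to both where clauses and running it one month at a time. On MySQL 8.0 or later, the same grouping can also be expressed with row_number() over( partition by ... ), deleting every row whose rank is greater than 1.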