Table cleanup with LINQ and EF
Problem description
Hello all,
I'm currently working on a cleanup routine for a database. I decided to use EF because I have no background in stored procedures or complex SQL statements, so I think using entities and LINQ would be the way to go.
The table I must clean contains historic information stored every minute. Now I must clean it up so that a record is kept only when the stored values change OR when a new day starts. Today there are more than 8 million records in the database.
To summarize, my table columns are:
ID (primary key), DATE (string), TIME(string), VAL1(int), VAL2(float), etc.
So, within a day (DATE) I must delete all duplicated records. Duplicated means that VAL1, VAL2, etc. are exactly the same as in the previous record. For instance, today I might have:
Row-> 1 | 5/5/2010 | 0000 | 23 | 2.4
Row-> 2 | 5/5/2010 | 0001 | 23 | 2.4
Row-> 3 | 5/5/2010 | 0002 | 23 | 3.0
Row-> 4 | 5/5/2010 | 0000 | 23 | 3.0
Row-> 5 | 5/6/2010 | 0000 | 23 | 3.0
After cleanup, I will have:
Row-> 1 | 5/5/2010 | 0000 | 23 | 2.4
Row-> 3 | 5/5/2010 | 0002 | 23 | 3.0
Row-> 5 | 5/6/2010 | 0000 | 23 | 3.0
How can I use LINQ and EF to perform this operation without having to iterate over all rows on the table?
Thanks in advance,
Igor.
Software Developer and AI Enthusiast. www.twitter.com/ikondrasovas
Hi Igor,
I have one workaround for your reference. First, we query all the ID values that qualify for deletion. Then we create new entities whose primary key ID matches each value in that query result, and attach them to the context. After calling the .DeleteObject API on these entities, we call .SaveChanges(), which deletes them one by one. Here is the sample code:
=====================================================================
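using (TestDBEntities context = new TestDBEntities())
{
    // Self-join: pair each row (d2) with the row whose ID is one lower (d1)
    // and select d2 when DATE, VAL1 and VAL2 are unchanged.
    var query = from d1 in context.DeleteTableTests
                from d2 in context.DeleteTableTests
                where d1.ID == d2.ID - 1 &&
                      d1.DATE == d2.DATE &&
                      d1.VAL1 == d2.VAL1 &&
                      d1.VAL2 == d2.VAL2
                select d2.ID;

    foreach (var id in query)
    {
        // Attach a stub entity that carries only the key, then mark it for deletion.
        var delete = new DeleteTableTest { ID = id };
        context.DeleteTableTests.Attach(delete);
        context.DeleteObject(delete);
    }

    // Sends one DELETE command per marked entity.
    context.SaveChanges();
}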
=====================================================================
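To check which rows the self-join above would pick, the same predicate can be run in memory with LINQ to Objects against the five sample rows from the question. This is only a sketch: the `Row` class and the `DuplicateFinder` helper are hypothetical names, not part of the original model, and it assumes, like the query above, that IDs are consecutive with no gaps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the DeleteTableTest entity (only the columns used here).
public class Row
{
    public int ID { get; set; }
    public string DATE { get; set; }
    public int VAL1 { get; set; }
    public double VAL2 { get; set; }
}

public static class DuplicateFinder
{
    // Same predicate as the EF query: a row is a duplicate when the row
    // with the previous ID has the same DATE, VAL1 and VAL2.
    public static List<int> FindDuplicateIds(IEnumerable<Row> rows)
    {
        var list = rows.ToList();
        var query = from d1 in list
                    from d2 in list
                    where d1.ID == d2.ID - 1 &&
                          d1.DATE == d2.DATE &&
                          d1.VAL1 == d2.VAL1 &&
                          d1.VAL2 == d2.VAL2
                    select d2.ID;
        return query.OrderBy(id => id).ToList();
    }

    // The five sample rows from the question (TIME is not part of the comparison).
    public static List<Row> SampleRows()
    {
        return new List<Row>
        {
            new Row { ID = 1, DATE = "5/5/2010", VAL1 = 23, VAL2 = 2.4 },
            new Row { ID = 2, DATE = "5/5/2010", VAL1 = 23, VAL2 = 2.4 },
            new Row { ID = 3, DATE = "5/5/2010", VAL1 = 23, VAL2 = 3.0 },
            new Row { ID = 4, DATE = "5/5/2010", VAL1 = 23, VAL2 = 3.0 },
            new Row { ID = 5, DATE = "5/6/2010", VAL1 = 23, VAL2 = 3.0 }
        };
    }
}
```

Calling `DuplicateFinder.FindDuplicateIds(DuplicateFinder.SampleRows())` selects IDs 2 and 4, which leaves rows 1, 3 and 5, the result expected in the question.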
Here the table is named DeleteTableTest, and I am using EF4 in VS2010. The code should be similar if you are using VS2008 and EFv1; the only difference is the Attach method. In EFv1 we need to use this API: http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.attach.aspx. For additional information about attaching and detaching objects in Entity Framework, see http://msdn.microsoft.com/en-us/library/bb896271.aspx.
Note: since you have more than 8 million records in the database, I would recommend running the above code over one ID range at a time, e.g. first handle the records whose ID is smaller than 10000, then 10000 to 20000, and so on. One drawback is that EF deletes the records one by one (each record uses its own DELETE command), so it may increase the traffic between the client and the database server.
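One way to walk the table in ID ranges is sketched below. The `IdRange` and `IdRanges` types and their parameters are hypothetical, not part of EF. The helper yields the ranges from the highest IDs down, on the assumption that each batch restricts only `d2.ID` to the range: the predecessor row (`d2.ID - 1`) then always sits in the same batch or in a lower range that has not been cleaned yet, so a run of duplicates that straddles a range boundary is still removed completely.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical inclusive ID range [From, To].
public class IdRange
{
    public int From { get; private set; }
    public int To { get; private set; }

    public IdRange(int from, int to)
    {
        From = from;
        To = to;
    }
}

public static class IdRanges
{
    // Splits [minId, maxId] into inclusive ranges of at most batchSize IDs,
    // returned from the highest range down to the lowest.
    public static IEnumerable<IdRange> Descending(int minId, int maxId, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        // long arithmetic avoids overflow when stepping below int.MinValue.
        for (long to = maxId; to >= minId; to -= batchSize)
        {
            long from = Math.Max((long)minId, to - batchSize + 1);
            yield return new IdRange((int)from, (int)to);
        }
    }
}
```

For each range, the query would get an extra condition such as `d2.ID >= range.From && d2.ID <= range.To`, ideally with a fresh context and one `SaveChanges()` call per batch so the context does not accumulate millions of tracked entities.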
Another workaround would be to use a SQL statement or a stored procedure directly, which can be done in one database call. The SQL statement could be something like:
=====================================================================
DELETE FROM DeleteTableTest WHERE ID IN (
    SELECT [Extent2].[ID] AS [ID]
    FROM [dbo].[DeleteTableTest] AS [Extent1]
    INNER JOIN [dbo].[DeleteTableTest] AS [Extent2]
        ON ([Extent1].[DATE] = [Extent2].[DATE])
        AND ([Extent1].[VAL1] = [Extent2].[VAL1])
        AND ([Extent1].[VAL2] = [Extent2].[VAL2])
    WHERE [Extent1].[ID] = ([Extent2].[ID] - 1))
=====================================================================
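If you would rather stay in C# than open a SQL tool, the statement can be kept as a constant and sent through whatever command API is available. The sketch below is an assumption-laden illustration: `CleanupSql` and the `executeCommand` hook are hypothetical names, and the hook is a plain delegate so the snippet has no dependency on EF or a database.

```csharp
using System;

public static class CleanupSql
{
    // The DELETE statement from the answer, kept as a single command.
    public const string DeleteDuplicates = @"
DELETE FROM DeleteTableTest WHERE ID IN (
    SELECT [Extent2].[ID]
    FROM [dbo].[DeleteTableTest] AS [Extent1]
    INNER JOIN [dbo].[DeleteTableTest] AS [Extent2]
        ON [Extent1].[DATE] = [Extent2].[DATE]
       AND [Extent1].[VAL1] = [Extent2].[VAL1]
       AND [Extent1].[VAL2] = [Extent2].[VAL2]
    WHERE [Extent1].[ID] = [Extent2].[ID] - 1)";

    // executeCommand is a hypothetical hook that sends one SQL command to the
    // database and returns the number of affected rows.
    public static int Run(Func<string, int> executeCommand)
    {
        if (executeCommand == null) throw new ArgumentNullException("executeCommand");
        return executeCommand(DeleteDuplicates);
    }
}
```

With an EF4 `ObjectContext`, the hook could plausibly be `sql => context.ExecuteStoreCommand(sql)`; treat that wiring as an assumption to verify against your EF version, and expect a long-running command on a table of this size.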
If you have any questions, please feel free to let me know.
Have a great day!
Best Regards,
Lingzhi Sun
MSDN Subscriber Support in Forum
If you have any feedback on our support, please contact msdnmg@microsoft.com