Updating ~5000 columns


Question


Is it possible to update ~5000 columns/second in SQL Server (either using wide tables or columns spread across multiple tables)? If yes, then what approach should I follow?

Thanks in advance

Answer

It's kinda hard to answer that question - if all the columns are in one table, then it's just a single update statement and the speed is solely a factor of the hardware performance of your database server.

If the columns are spread across a number of tables, I'd expect to use a stored procedure that did the update work, and received all those parameters from the application. But, probably, you can't pass 5000 parameters to a stored procedure. So, now you're having to break the command up into multiple (maybe even 100 or more) distinct invocations. Heavy-ish on the application and the network, but not so heavy as to push the work over 1 second.
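Breaking one wide update into several smaller parameterized statements can be sketched as follows. This is only an illustration in Python using the standard-library `sqlite3` module (the question concerns SQL Server, where a stored procedure accepts at most 2100 parameters); the table `wide` and the helper `chunked_update` are invented names, and `MAX_PARAMS` is kept artificially small to show the chunking:

```python
import sqlite3

MAX_PARAMS = 3  # artificially small, just to demonstrate the splitting

def chunked_update(conn, table, key_col, key, values):
    """Apply `values` (column -> new value) in several UPDATEs of at most
    MAX_PARAMS columns each, so no single call exceeds a parameter limit.
    Column names come from the trusted schema, not from user input."""
    items = list(values.items())
    for i in range(0, len(items), MAX_PARAMS):
        chunk = items[i:i + MAX_PARAMS]
        set_clause = ", ".join(f"{col} = ?" for col, _ in chunk)
        params = [v for _, v in chunk] + [key]
        conn.execute(
            f"UPDATE {table} SET {set_clause} WHERE {key_col} = ?", params)

conn = sqlite3.connect(":memory:")
cols = [f"c{i}" for i in range(8)]
conn.execute(f"CREATE TABLE wide (id INTEGER PRIMARY KEY, "
             f"{', '.join(c + ' INTEGER' for c in cols)})")
conn.execute(f"INSERT INTO wide (id, {', '.join(cols)}) "
             f"VALUES (1, {', '.join('0' for _ in cols)})")

chunked_update(conn, "wide", "id", 1, {c: n for n, c in enumerate(cols)})
conn.commit()
print(conn.execute("SELECT c0, c7 FROM wide WHERE id = 1").fetchone())  # (0, 7)
```

With a real 5000-column update you'd tune the chunk size to stay under the server's parameter limit while keeping the number of round trips low.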

Now, you have to look at 100+ commands, each passing related data to the database - what's the best architecture? You probably need to ensure that the row(s) that will receive that data don't present a partial/incomplete picture to the app(s) that read that/those rows.

You could run the commands in a transaction, and have the application(s) that read the row use a safe isolation level.

What I'd do, however, is create a working table that will temporarily hold the data from the application; 100 commands can insert into this table without affecting the state of the rest of the database. Each row inserted will have an application-generated pseudo-transaction id. When all data is transferred, a final command invokes a stored procedure, taking perhaps just one parameter (the pseudo-transaction ID), perhaps more (row identifier(s) for the destination rows). That stored procedure begins a database transaction, moves the data from the holding table into its final destination, deletes the data from the holding table and commits the transaction.
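A minimal sketch of that holding-table pattern, using SQLite purely for illustration (the answer targets SQL Server, where the final step would be a stored procedure; the table and column names here are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE holding (pseudo_txn_id TEXT, target_id INTEGER,
                          name TEXT, value INTEGER);
    CREATE TABLE target  (target_id INTEGER, name TEXT, value INTEGER,
                          PRIMARY KEY (target_id, name));
""")

# Phase 1: many small inserts stage the data without touching `target`.
# Each row carries the application-generated pseudo-transaction id.
staged = [("txn-1", 42, f"col{i}", i * 10) for i in range(5)]
conn.executemany("INSERT INTO holding VALUES (?, ?, ?, ?)", staged)

# Phase 2: one short transaction moves the staged rows and cleans up,
# so readers of `target` never see a partially written row set.
with conn:  # sqlite3 commits (or rolls back) on leaving this block
    conn.execute("""
        INSERT OR REPLACE INTO target (target_id, name, value)
        SELECT target_id, name, value FROM holding
        WHERE pseudo_txn_id = ?
    """, ("txn-1",))
    conn.execute("DELETE FROM holding WHERE pseudo_txn_id = ?", ("txn-1",))

print(conn.execute("SELECT COUNT(*) FROM target").fetchone()[0])   # 5
print(conn.execute("SELECT COUNT(*) FROM holding").fetchone()[0])  # 0
```

The slow, chatty part (phase 1) happens outside any transaction on the destination table; only the fast set-based move (phase 2) holds locks.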

This approach removes the application/network round-trip time from the elapsed time within the database transaction. You'll still need to use an appropriate isolation level for the consumers of the data, but performance will be better.

APPROACHING THE PROBLEM FROM A DIFFERENT DIRECTION...
5000 values to insert into a single row (or set of related rows) is VERY unusual. Now, perhaps your application is just that unusual and it's exactly the right thing to do. Perhaps, however, there are ways of reorganising your data flow, or data storage architecture, that simply make the problem go away.

If you're receiving a steady flow of data, then batching it into 5k items in a batch is perhaps unnecessary.

Storage-wise; it may be better to store the 5k values as 5k rows in a table with columns such as: ParentEntity (a foreign key to the owning object), Name, Value. Now, you just don't need a super-wide table (rarely a good idea, anyway) and you can send the data to the database in smaller chunks.
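That narrow Name/Value layout might look like this - again a sketch with SQLite standing in for SQL Server, and with illustrative names (`entity_values`, `sensor_*`) that are not from the original question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entity_values (
        parent_entity INTEGER NOT NULL,   -- FK to the owning object
        name          TEXT    NOT NULL,
        value         REAL,
        PRIMARY KEY (parent_entity, name)
    )
""")

# 5000 values become 5000 narrow rows, sent in modest chunks rather
# than as one statement with 5000 parameters.
values = [(7, f"sensor_{i}", float(i)) for i in range(5000)]
CHUNK = 500
for i in range(0, len(values), CHUNK):
    conn.executemany(
        "INSERT OR REPLACE INTO entity_values VALUES (?, ?, ?)",
        values[i:i + CHUNK],
    )
conn.commit()

print(conn.execute(
    "SELECT COUNT(*) FROM entity_values WHERE parent_entity = 7"
).fetchone()[0])  # 5000
```

The `INSERT OR REPLACE` (an upsert, in SQL Server terms a `MERGE`) makes re-sending a value an update rather than a duplicate-key error.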

There may be an intermediate data architecture - maybe your 5k values can meaningfully be grouped into subsets of related data which can (and perhaps, should) be sent to the database in groups.

HTH,
Chris


You can use sqlite to perform this operation.

