Possible to insert with a Table Parameter, and also retrieve identity values?


Question

I'm trying to insert records using a high-performance table-valued parameter (TVP) method ( http://www.altdevblogaday.com/2012/05/16/sql-server-high-performance-inserts/ ), and I'm curious whether it's possible to retrieve the identity value for each record I insert.

At the moment, the answer appears to be no: I insert the data, then retrieve the identity values, and they don't match. Specifically, they fail to match about 75% of the time, and the mismatches are unpredictable. Here's some code that replicates the issue:

using System.Data;
using System.Data.SqlClient;

// Create a datatable with 100k rows
DataTable dt = new DataTable();
dt.Columns.Add(new DataColumn("item_id", typeof(int)));
dt.Columns.Add(new DataColumn("comment", typeof(string)));
for (int i = 0; i < 100000; i++) {
    dt.Rows.Add(new object[] { 0, i.ToString() });
}

// Insert these records and retrieve back the identity
using (SqlConnection conn = new SqlConnection("Data Source=localhost;Initial Catalog=testdb;Integrated Security=True")) {
    conn.Open();
    using (SqlCommand cmd = new SqlCommand("proc_bulk_insert_test", conn)) {
        cmd.CommandType = CommandType.StoredProcedure;

        // Adding a "structured" parameter allows you to insert tons of data with low overhead
        SqlParameter param = new SqlParameter("@mytable", SqlDbType.Structured);
        param.Value = dt;
        cmd.Parameters.Add(param);
        SqlDataReader dr = cmd.ExecuteReader();

        // Set all the records' identity values
        int i = 0;
        while (dr.Read()) {
            dt.Rows[i].ItemArray = new object[] { dr.GetInt32(0), dt.Rows[i].ItemArray[1] };
            i++;
        }
        dr.Close();
    }

    // Do all the records' ID numbers match what I received back from the database?
    using (SqlCommand cmd = new SqlCommand("SELECT * FROM bulk_insert_test WHERE item_id >= @base_identity ORDER BY item_id ASC", conn)) {
        cmd.Parameters.AddWithValue("@base_identity", (int)dt.Rows[0].ItemArray[0]);
        SqlDataReader dr = cmd.ExecuteReader();
        DataTable dtresult = new DataTable();
        dtresult.Load(dr);
    }
}

The database is defined using this SQL Server script:

CREATE TABLE bulk_insert_test (
    item_id int IDENTITY (1, 1) NOT NULL PRIMARY KEY,
    comment varchar(20)
)
GO

CREATE TYPE bulk_insert_table_type AS TABLE ( item_id int, comment varchar(20) )
GO

CREATE PROCEDURE proc_bulk_insert_test
    @mytable bulk_insert_table_type READONLY
AS

DECLARE @TableOfIdentities TABLE (IdentValue INT)

INSERT INTO bulk_insert_test (comment)
OUTPUT Inserted.item_id INTO @TableOfIdentities(IdentValue)
SELECT comment FROM @mytable

SELECT * FROM @TableOfIdentities

Here's the problem: the values returned from proc_bulk_insert_test are not in the same order as the original records were inserted. Therefore, I can't programmatically assign each record the item_id value I received back from the OUTPUT clause.

It seems like the only valid solution is to SELECT back the entire list of records I just inserted, but frankly I'd prefer any solution that would reduce the amount of data piped across my SQL Server's network card. Does anyone have better solutions for large inserts while still retrieving identity values?

EDIT: Let me try clarifying the question a bit more. The problem is that I would like my C# program to learn what identity values SQL Server assigned to the data that I just inserted. The order isn't essential, but I would like to be able to take an arbitrary set of records within C#, insert them using the fast table parameter method, and then assign their auto-generated ID numbers in C# without having to requery the entire table back into memory.

Given that this is an artificial test set, I attempted to condense it into as small a piece of readable code as possible. Let me describe the methods I have used so far to resolve this issue:

  1. In the application this example came from, my original code inserted about 15 million rows using 15 million individual INSERT statements, retrieving the identity value after each one. This worked but was slow.
  2. I revised the code to use high-performance table parameters for insertion. I would then dispose of all of the objects in C# and read the entire objects back from the database. This was fast and it worked, but the original records had dozens of columns with lots of varchar and decimal values, so it was very network-intensive.
  3. I then began researching whether it was possible to use the table parameter insert while asking SQL Server to report back only the identity values. I tried scope_identity() and OUTPUT but haven't been successful with either so far.

Basically, this problem would be solved if SQL Server always inserted the records in exactly the order I provided them. Is it possible to make SQL Server insert records in exactly the order in which they are provided in a table-valued parameter insert?

EDIT2: This approach seems very similar to what Cade Roux cites below:

http://www.sqlteam.com/article/using-the-output-clause-to-capture-identity-values-on-multi-row-inserts

However, in the article, the author uses a magic unique value, "ProductNumber", to connect the inserted information from the "output" value to the original table-valued parameter. I'm trying to figure out how to do this if my table doesn't have a magic unique value.

Solution

Your TVP is an unordered set, just like a regular table. It only has an order when you specify one. Not only do you have no way to indicate the actual order here, you're also just doing a SELECT * at the end with no ORDER BY. What order do you expect? You've effectively told SQL Server that you don't care. That said, I implemented your code and had no problems getting the rows back in the right order. I modified the procedure slightly so that you can actually tell which identity value belongs to which comment:

DECLARE @TableOfIdentities TABLE (IdentValue INT, comment varchar(20))

INSERT INTO bulk_insert_test (comment)
OUTPUT Inserted.item_id, Inserted.comment 
INTO @TableOfIdentities(IdentValue, comment)
SELECT comment FROM @mytable

SELECT * FROM @TableOfIdentities

Then I called it using this code (we don't need all the C# for this):

DECLARE @t bulk_insert_table_type;
INSERT @t VALUES(5,'foo'),(2,'bar'),(3,'zzz');
SELECT * FROM @t;

EXEC dbo.proc_bulk_insert_test @t;

Results:

1   foo
2   bar
3   zzz

If you want to make sure the output is in the order of identity assignment (which isn't necessarily the same "order" that your unordered TVP has), you can add an ORDER BY on the identity column (IdentValue in the table variable, which holds item_id) to the last SELECT in your procedure.
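As a rough sketch of that change, the procedure from the question could be altered as follows. It assumes bulk_insert_test, bulk_insert_table_type and proc_bulk_insert_test already exist exactly as defined in the question's script. The SET NOCOUNT ON line and the explicit column list are additions for illustration, not part of the original procedure.

```sql
-- Assumes the table, table type and procedure from the question's script exist.
ALTER PROCEDURE proc_bulk_insert_test
    @mytable bulk_insert_table_type READONLY
AS
BEGIN
    SET NOCOUNT ON;  -- added for illustration: suppresses "rows affected" messages

    -- Capture each generated identity value together with its comment.
    DECLARE @TableOfIdentities TABLE (IdentValue INT, comment varchar(20));

    INSERT INTO bulk_insert_test (comment)
    OUTPUT Inserted.item_id, Inserted.comment
    INTO @TableOfIdentities (IdentValue, comment)
    SELECT comment FROM @mytable;

    -- The order of this result set comes only from the ORDER BY below.
    -- Note the column is IdentValue, because that is its name in the table variable.
    SELECT IdentValue, comment
    FROM @TableOfIdentities
    ORDER BY IdentValue;
END
GO
```

With this version the client can pair each returned IdentValue with its comment. That pairing is only unambiguous if the comment values are unique within a batch, as they are in the question's test data.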

If you want to insert into the destination table so that your identity values are in an order that is important to you, then you have a couple of options:

  • add a column to your TVP and insert the order into that column, then use a cursor to iterate over the rows in that order, and insert one at a time. Still more efficient than calling the entire procedure for each row, IMHO.

  • add a column to your TVP that indicates order, and use an ORDER BY on the insert. This isn't guaranteed, but is relatively reliable, particularly if you eliminate parallelism issues using MAXDOP 1.
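The second option might look something like the sketch below. The type name bulk_insert_table_type_ordered, the column seq and the procedure name proc_bulk_insert_test_ordered are hypothetical names chosen for illustration; only bulk_insert_test comes from the question. As noted above, treat the resulting order as relatively reliable rather than guaranteed.

```sql
-- Hypothetical variant of the table type with a client-assigned ordering column.
CREATE TYPE bulk_insert_table_type_ordered AS TABLE (
    seq     int NOT NULL PRIMARY KEY,  -- position of the row in the client's list
    comment varchar(20)
);
GO

-- Hypothetical procedure: inserts in seq order and returns only identity values.
CREATE PROCEDURE proc_bulk_insert_test_ordered
    @mytable bulk_insert_table_type_ordered READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @TableOfIdentities TABLE (IdentValue int NOT NULL PRIMARY KEY);

    -- ORDER BY asks for rows to be taken in seq order;
    -- MAXDOP 1 rules out a parallel plan for this statement.
    INSERT INTO bulk_insert_test (comment)
    OUTPUT Inserted.item_id INTO @TableOfIdentities (IdentValue)
    SELECT comment
    FROM @mytable
    ORDER BY seq
    OPTION (MAXDOP 1);

    -- Return identity values in ascending order so the client can pair them
    -- positionally with its rows sorted by seq.
    SELECT IdentValue
    FROM @TableOfIdentities
    ORDER BY IdentValue;
END
GO
```

A call could then look like this, with the client matching the first returned value to seq 0, the second to seq 1, and so on:

```sql
DECLARE @t bulk_insert_table_type_ordered;
INSERT @t (seq, comment) VALUES (0,'foo'),(1,'bar'),(2,'zzz');
EXEC dbo.proc_bulk_insert_test_ordered @t;
```

Only one integer per row travels back over the network, which is the traffic reduction the question asks for. The trade-off is that the positional pairing relies on identity values having been assigned in seq order.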

In any case, you seem to be placing a lot of importance on order. What does your order actually mean? If you want order to carry some meaning, you shouldn't be expressing it through an IDENTITY column.
