C# and SQL Server: normalizing large sets of URLs


Question

Many tables in my database have at least one column that contains a URL, and the same URLs are repeated many times throughout the database. So I normalize them into a dedicated table and use numeric IDs everywhere I need them. I often need to join on them, so numeric IDs are much better than full strings.

In MySQL + C++, to insert a lot of URLs in one go, I used multi-row INSERT IGNORE statements or mysql_set_local_infile_handler(), followed by a batched SELECT with IN () to pull the IDs back from the database.

In C# + SQL Server, I noticed the SqlBulkCopy class, which is very useful and fast for mass insertion. But I also need mass selection to resolve the URL IDs after I insert them. Is there a helper class that would work the same way as SELECT WHERE IN (many, urls, here)?

Or do you have a better idea for turning URLs into numbers in a consistent manner in C#? I thought about hashing the URLs with CRC32 or CRC64, but I worry about collisions. I wouldn't mind if collisions were rare, but if they aren't, it would be a problem.

PS: To give an idea of scale, we're talking about tens of millions of URLs.
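
A rough sanity check on the collision worry at this scale, assuming hash values are uniformly distributed over b bits (CRCs are not designed as hash functions, so treat this as an optimistic estimate) and taking n = 5 × 10^7 as an assumed stand-in for "tens of millions". The expected number of colliding pairs follows the usual birthday approximation:

```latex
\begin{aligned}
E[\text{colliding pairs}] &\approx \frac{n(n-1)}{2 \cdot 2^{b}} \\
b = 32:\quad \frac{(5 \times 10^{7})^{2}}{2 \cdot 2^{32}} &\approx \frac{1.25 \times 10^{15}}{4.29 \times 10^{9}} \approx 2.9 \times 10^{5} \\
b = 64:\quad \frac{(5 \times 10^{7})^{2}}{2 \cdot 2^{64}} &\approx \frac{1.25 \times 10^{15}}{1.84 \times 10^{19}} \approx 6.8 \times 10^{-5}
\end{aligned}
```

Under those assumptions, a 32-bit value would be expected to collide hundreds of thousands of times, while a 64-bit value would most likely not collide at all.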

PS: For a basic large insert, SqlBulkCopy is faster than SqlDbType.Structured. It also has the SqlRowsCopied event for a status-tracking callback.

Recommended answer

There is an even better way than SqlBulkCopy.

It's called structured parameters, and it lets you pass a table-valued parameter to a stored procedure or query through ADO.NET.

There are code examples in the article, so I will only highlight what you need to do to get it up and running:

  1. Create a user-defined table type in the database. You can call it UrlTable.
  2. Set up a stored procedure or query that does the SELECT by joining with a table variable of type UrlTable.
  3. In your backing code (C#), create a DataTable with the same structure as UrlTable, populate it with URLs, and pass it to a SqlCommand as a structured parameter. Note that the column order must correspond between the DataTable and the table type.
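
The three steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the article: the dbo.Urls table with Id (INT) and Url columns, the single Url column of dbo.UrlTable and its length, and the UrlLookup class and method names are all hypothetical. It uses System.Data.SqlClient; the newer Microsoft.Data.SqlClient package exposes the same types.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class UrlLookup
{
    // Step 1 (run once in the database). Column name and length are assumptions:
    //   CREATE TYPE dbo.UrlTable AS TABLE (Url NVARCHAR(2000) NOT NULL);

    // Step 2: join the table-valued parameter against the hypothetical dbo.Urls table.
    public const string SelectIdsSql =
        "SELECT u.Id, u.Url FROM dbo.Urls AS u " +
        "INNER JOIN @urls AS t ON t.Url = u.Url;";

    // Step 3a: build a DataTable whose columns match dbo.UrlTable in order and type.
    public static DataTable BuildUrlTable(IEnumerable<string> urls)
    {
        var table = new DataTable();
        table.Columns.Add("Url", typeof(string));
        foreach (var url in urls)
            table.Rows.Add(url);
        return table;
    }

    // Step 3b: pass the DataTable as a structured parameter and read the IDs back.
    public static Dictionary<string, int> ResolveIds(
        string connectionString, IEnumerable<string> urls)
    {
        var ids = new Dictionary<string, int>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(SelectIdsSql, conn))
        {
            var p = cmd.Parameters.Add("@urls", SqlDbType.Structured);
            p.TypeName = "dbo.UrlTable"; // must name the user-defined table type
            p.Value = BuildUrlTable(urls);

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    ids[reader.GetString(1)] = reader.GetInt32(0); // Url -> Id
            }
        }
        return ids;
    }
}
```

A call such as UrlLookup.ResolveIds(connectionString, batchOfUrls) would then play the role of the batched SELECT ... IN () from the question; URLs that are not yet in dbo.Urls are simply absent from the result.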

What ADO.NET does behind the scenes (you can see this if you profile the query) is declare a variable of type UrlTable before the query and populate it, via INSERT statements, with the rows you pass in the structured parameter.

Other than that, query-wise, you can do pretty much anything with table-valued parameters in SQL (join, select, etc.).
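
As one illustration of that flexibility, the same parameter can feed both an insert of the missing URLs and the ID lookup in a single batch. This is a hedged sketch, not part of the original answer: dbo.Urls (with an identity Id column and a Url column), the dbo.UrlTable column name, and the UrlUpsert class are assumptions, and without a unique constraint or suitable locking the NOT EXISTS check is not safe against concurrent writers.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class UrlUpsert
{
    // The table-valued parameter is read-only but can be referenced
    // by more than one statement in the same batch.
    public const string InsertMissingThenSelectSql = @"
INSERT INTO dbo.Urls (Url)
SELECT DISTINCT t.Url
FROM @urls AS t
WHERE NOT EXISTS (SELECT 1 FROM dbo.Urls AS u WHERE u.Url = t.Url);

SELECT u.Id, u.Url
FROM dbo.Urls AS u
INNER JOIN @urls AS t ON t.Url = u.Url;";

    // Builds the command; the caller opens the connection and calls ExecuteReader.
    // urlTable must match dbo.UrlTable column for column.
    public static SqlCommand CreateCommand(SqlConnection conn, DataTable urlTable)
    {
        var cmd = new SqlCommand(InsertMissingThenSelectSql, conn);
        var p = cmd.Parameters.Add("@urls", SqlDbType.Structured);
        p.TypeName = "dbo.UrlTable";
        p.Value = urlTable;
        return cmd;
    }
}
```

The reader returned by ExecuteReader on this command yields one (Id, Url) row per matching URL, covering both the rows that already existed and the ones just inserted.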
