Filter sql based on C# List instead of a filter table


Question


Say I have a table with the following data:


Now I want to filter by the primary keys department and number. I have a list of department and number combinations that have to be filtered in code. In my mind, I would create a join that results in the following:

select * from employee e
inner join dynamicTable dyn on e.Department = dyn.Department 
                           and e.Number = dyn.Number;


dynamicTable is my List in C# code that has the primary keys to filter, but I don't know how to pass this list to the database level.


I don't want to load everything from my employees table and then filter in code with LINQ or something else, because I have millions of employees in my database.


I already thought about combining the primary keys and creating a WHERE ... IN (...), but Firebird restricts an IN list to at most 1500 items.


Database used is Firebird version 2.1

Answer


Personally I can see two tricks you can pursue, and one more "blast from the past".


Route #1. Use GTT: GLOBAL TEMPORARY TABLE


GTTs were introduced in FB 2.1 (which you use) and can be per-connection or per-transaction. You would want the per-transaction kind. This difference only concerns the data (rows); the schema (structure and indexes, the metadata) is persistent. See the ON COMMIT DELETE ROWS option in the GTT documentation.

  • https://www.firebirdsql.org/refdocs/langrefupd21-ddl-table.html
  • http://firebirdsql.su/doku.php?id=create_global_temporary_table and www.translate.ru
  • Firebird global temporary table (GTT), touch other tables?

and so on.


In that way, you open the transaction, you fill the GTT with the data from your list (copying those 1500 value-pairs of data from your workstation to the server), you run your query JOINing over that GTT, and then you COMMIT your transaction and the table content is auto-dropped.
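The transaction-scoped flow above can be sketched as follows. This is a minimal illustration in Python (the same sequence maps one-to-one to C#/ADO.NET), assuming any PEP 249 (DB-API) style Firebird driver such as fdb; the GTT name `filter_keys` and the helper function are my own, not from the answer.

```python
# Sketch of route #1: fill a per-transaction GTT, join against it, COMMIT.
# The table name filter_keys and the helper below are illustrative assumptions.

# One-time DDL: the structure is persistent, the rows are not.
CREATE_GTT = """
CREATE GLOBAL TEMPORARY TABLE filter_keys (
    Department INTEGER NOT NULL,
    Number     INTEGER NOT NULL,
    PRIMARY KEY (Department, Number)
) ON COMMIT DELETE ROWS
"""

FILL_GTT = "INSERT INTO filter_keys (Department, Number) VALUES (?, ?)"

JOIN_QUERY = """
SELECT e.*
FROM employee e
INNER JOIN filter_keys k
        ON e.Department = k.Department
       AND e.Number = k.Number
"""

def fetch_filtered(con, pairs):
    """Ship the key pairs into the GTT, join, and let COMMIT clear the rows.

    `con` is any PEP 249 connection; `pairs` is the C# List equivalent,
    e.g. [(1, 10), (1, 11), (2, 20)].
    """
    cur = con.cursor()
    cur.executemany(FILL_GTT, pairs)   # e.g. the ~1500 (Department, Number) pairs
    cur.execute(JOIN_QUERY)
    rows = cur.fetchall()
    con.commit()                       # ON COMMIT DELETE ROWS empties filter_keys
    return rows
```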


If you run many almost-similar queries in one session, it might make sense to make that GTT per-connection instead and modify the data as needed, rather than re-filling it for every next query in every next transaction, but that is a more complex approach. Cleansing early on every COMMIT is what I'd prefer as the default approach, until it is argued why per-connection would be better in a specific case. Just don't keep that garbage on the server between queries.


Route #2. Use string search - reversed LIKE matching.


In its basic form this method works for searching for some huge and arbitrary list of integer numbers. Your case is a bit more complex, you match against PAIRS of numbers, not single ones.


The simple idea is like this: let's assume we want to fetch rows where the ID column can be 1, 4, 12 or 24. A straightforward approach would be making four queries, one per value, or making WHERE ID = 1 OR ID = 4 OR ..., or using WHERE ID IN (1, 4, 12, 24). Internally, IN would be unrolled into that very = OR = OR = form and then most probably executed as four queries. Not very efficient for long lists.


So instead - for really long lists to match - we may form a special string and match against it as text. This makes the matching itself much less efficient and prohibits using any index; the server runs a NATURAL SCAN over the whole table - but it is a one-pass scan. When the matching list is really large, a one-pass full-table scan can be more efficient than thousands of by-index fetches. BUT only when the list-to-table ratio is really large; it depends on your specific data.


We make a text enlisting all our target values, interspersed with and wrapped in a delimiter: "~1~4~12~24~". Now we build the same delimiter-number-delimiter string from our ID column and see whether such a substring can be found.


The usual use of LIKE/CONTAINING is to match a column against data, like this: SELECT * from the_table WHERE column_name CONTAINING value_param
We reverse it: SELECT * from the_table WHERE value_param CONTAINING column-name-based-expression

  SELECT * from the_table WHERE '~1~4~12~24~' CONTAINING '~' || ID || '~' 


This assumes ID gets auto-cast from integer to string. If not, you would have to do it manually: .... CONTAINING '~' || CAST(ID AS VARCHAR(100)) || '~'
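Building that needle string on the client side is trivial; here is a sketch (Python for illustration, and the helper name `pack_ids` is mine, not from the answer):

```python
def pack_ids(ids, delim="~"):
    """Build the '~1~4~12~24~' needle for the reversed CONTAINING match."""
    return delim + delim.join(str(i) for i in ids) + delim

# pack_ids([1, 4, 12, 24]) -> '~1~4~12~24~'
```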


Your case is a bit more complex, you need to match two numbers, Department and Number, so you would have to use TWO DIFFERENT delimiters, if you follow this way. Something like

SELECT * FROM employee e WHERE
  '~1@10~1@11~2@20~3@7~3@66~' CONTAINING
  '~' || e.Department || '@' || e.Number || '~'
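The two-delimiter needle for the (Department, Number) pairs can be built the same way (Python sketch; the helper name is mine):

```python
def pack_pairs(pairs, outer="~", inner="@"):
    """Build the '~1@10~1@11~...~' needle for (Department, Number) pairs."""
    return outer + outer.join(f"{d}{inner}{n}" for d, n in pairs) + outer

# pack_pairs([(1, 10), (1, 11), (2, 20), (3, 7), (3, 66)])
# -> '~1@10~1@11~2@20~3@7~3@66~'
```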


Gotcha: you say your target list is 1500 elements. The target line would be... long. How exactly long???


VARCHAR in Firebird is limited to 32KB AFAIR, and longer texts have to be made text BLOBs, with reduced functionality. Does LIKE work against BLOBs in FB 2.1? I don't remember; check the release notes. Also check whether your library even allows you to specify the parameter type as a BLOB rather than a string.

Now, what is your CONNECTION CHARSET? If it is something like Windows-1250 or Windows-1251, then one character is one byte and you can fit 32K characters into 32KB. But if the connection charset your application sets is UTF-8, then each letter takes up to 4 bytes, and your maximum VARCHAR-able string shrinks to 8K letters.
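A back-of-the-envelope check for this limit can be sketched as follows (Python for illustration; the 32KB figure and the 4-bytes-per-character UTF-8 worst case come from the paragraph above, and the helper name is hypothetical):

```python
def fits_in_varchar(needle, bytes_per_char, limit_bytes=32 * 1024):
    """Rough check whether the needle string still fits a Firebird VARCHAR.

    bytes_per_char: 1 for single-byte charsets (WIN1250/WIN1251),
    4 for a UTF-8 connection charset (worst case per character).
    The 32KB limit is the answer's AFAIR figure, not an exact spec value.
    """
    return len(needle) * bytes_per_char <= limit_bytes
```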


You may try to avoid using parameter for this long string and to inline the target string constant into the SQL statement. But then you may hit the limit of maximum SQL statement length instead.


See also: MON$CHARACTER_SET_ID in c:\Program Files\Firebird\Firebird_2_1\doc\README.monitoring_tables.txt, and then the SYSTEM TABLES section in the FB docs for how to map IDs to charset textual names.


Route #3. Poor man's GTT. Enter pseudo-tables.


This trick could be used sometimes in older IB/FB versions before GTTs were introduced.


Pro: you do not need to change your persistent schema.
Con: without changing the schema you cannot create indices and cannot use indexed joining. And yet again, you can hit the length limit of a single SQL statement.


Really, I don't think this is applicable to your case; just to make the answer complete, I think this trick should be mentioned too.

select * from employee e, (
  SELECT 1 as Department, 10 as Number FROM RDB$DATABASE
  UNION ALL SELECT 1, 11 FROM RDB$DATABASE
  UNION ALL SELECT 2, 20 FROM RDB$DATABASE
  UNION ALL SELECT 3, 7 FROM RDB$DATABASE
  UNION ALL SELECT 3, 66 FROM RDB$DATABASE
) t
where e.Department = t.Department 
  and e.Number = t.Number
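Generating that UNION ALL pseudo-table text from the key list can be sketched like this (Python for illustration; the helper name is my own):

```python
def pseudo_table_sql(pairs):
    """Emit the RDB$DATABASE UNION ALL pseudo-table for (Department, Number) pairs."""
    first_d, first_n = pairs[0]
    # Only the first SELECT needs the column aliases; the UNION takes
    # its column names from the first branch.
    rows = [f"SELECT {first_d} AS Department, {first_n} AS Number FROM RDB$DATABASE"]
    rows += [f"SELECT {d}, {n} FROM RDB$DATABASE" for d, n in pairs[1:]]
    return "\nUNION ALL ".join(rows)
```

Note that with 1500 pairs the generated statement gets long, which is exactly the single-statement length limit the answer warns about.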


Crude and ugly, but sometimes this pseudo-table might help. When? Mostly it helps with batch INSERT-from-SELECT, where indexing is not needed :-D It is rarely applicable to SELECTs - but just know the trick.
