在表中获取列的最大值的最快方法是什么? [英] What's the fastest way to obtain the maximum value of a column in a table?

查看:146
本文介绍了在表中获取列的最大值的最快方法是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

INFORMIX-SQL 4.10.DC1(SE Engine),在DOS 6.22上,在Microsoft Virtual PC 2007下,在Windows 7中!

INFORMIX-SQL 4.10.DC1 (SE Engine), on DOS 6.22, under Microsoft Virtual PC 2007, in Windows 7!

EDIT: Looking for Jonathan Leffler's wisdom on this one!

我有一个名为transaction的子表和一个名为customer的父表。

I have a child table called transaction and a parent table called customer.

这些表由customer.pk_id SERIAL = transaction.fk_id INTEGER连接。

These tables are joined by customer.pk_id SERIAL = transaction.fk_id INTEGER.

事务表在fk_id上有一个聚簇索引,客户交易与匹配索引的数据物理分组在一起。我选择fk_id上的集群索引的原因是因为大多数查询都是对客户名称进行的,并且这快速返回属于被查询客户的所有事务。

The transaction table has a clustered index on fk_id so that each customers transactions are physically grouped together with the data matching the index. The reason I chose the cluster index on fk_id is because most queries are done on customer name and this quickly returns all of the transactions belonging to the queried customer.

事务表还有一个名为transaction_number SERIAL且具有唯一索引的列。

The transaction table also has a column called transaction_number SERIAL with a unique index.

当为任何客户添加新事务时,将在EOF中添加一个具有最高transaction_number的新行。注意:在创建索引后,Informix不会在数据文件中维护物理聚簇。

When a new transaction is added for any customer, a new row is added at EOF with the highest transaction_number.. Note: Informix does not maintain the physical clustering in the datafile after the index is created.

因此,当我要打印输入的最后一个事务的客户收据时,以下查询:

So when I want to print a customer receipt of the last transaction entered I do the following query:

SELECT * 
  FROM customer c, 
       transaction t
 WHERE c.pk_id = t.fk_id
   AND t.trx_num = (SELECT MAX(trx_num)
                      FROM transaction)


b $ b

编辑:以下INDEXES存在:

The following INDEXES exist:

UNIQUE INDEX ON customer(pk_id) {SERIAL}
UNIQUE CLUSTER INDEX ON customer(last_name,sur_name,first_name,middle_name) {CHAR(30's)}
UNIQUE INDEX ON customer(ident_type,ident_num,ident_state,ident_country) {CHARS(n)}

UNIQUE CLUSTER INDEX ON transaction(fk_id) {INT}
UNIQUE INDEX ON transaction(trx_num) {SERIAL}


b $ b

EDIT:上述查询的EXPLAIN结果:

EXPLAIN results for the above query:

QUERY:
------
SELECT *
  FROM customer c,
       transaction t,
 WHERE c.pk_id   = t.fk_id           
   AND t.trx_num = (SELECT MAX(trx_num) 
                      FROM transaction)

Estimated Cost: 14
Estimated # of Rows Returned: 2

1) f.transaction: INDEX PATH

    (1) Index Keys: trx_num 
        Lower Index Filter: f.transaction.trx_num = <subquery> 

2) f.customer: INDEX PATH

    (1) Index Keys: pk_id 
        Lower Index Filter: f.customer.pk_id = f.transaction.fk_id 

    Subquery:
    ---------
    Estimated Cost: 3
    Estimated # of Rows Returned: 1

    1) f.trx_num: INDEX PATH

        (1) Index Keys: trx_num 

看起来查询优化器正在对事务执行全表扫描,以便定位最高事务数,其新行始终放在EOF,但由于事务按fk_id分组,因此其余事务数分散

It seems like the query optimizer is doing a full table scan on transaction in order to locate the highest transaction number, whose new row is always placed at EOF , but since transactions are grouped by fk_id, the rest of the transaction numbers are scattered throughout the table.

有更好的方法来获得更快的查询响应时间吗?

Is there a better way for obtaining a quicker response time on the query?

CREATE UNIQUE INDEX trxnumidx ON事务(transaction_number)DESCENDING 有助于快速定位MAX transaction_number?

Would CREATE UNIQUE INDEX trxnumidx ON transaction(transaction_number) DESCENDING help in quickly locating the MAX transaction_number?

编辑:我有一个类似的查询,可以快速检索所需的交易和客户信息,以便用户可以打印最后输入的交易或任何先前输入的交易,但它需要用户输入交易号:

I do have a similar query which quickly retrieves the desired transaction and customer information so that the user can print the last entered transaction, or any previously entered transaction, but it requires the user to input the transaction number:

SELECT * 
      FROM customer c, 
           transaction t
     WHERE c.pk_id = t.fk_id
       AND t.trx_num = $trxnum  {ace input variable, type INT}

$对于这个查询,用户手动输入交易号,这是什么阻碍我的是检索是瞬时的(总成本= 2),其中当自动与MAX(总成本= 14)我选择使这个查询自动的原因是,在过去,当用户手动输入交易编号时,有时他们无意中输入了错误的交易编号,这是一个有效的编号,没有意识到,他们签名并给予

What baffles me about this query, where the user manually inputs the transaction number, is that retrieval is instantaneous (total cost = 2), where as when automatic with MAX (total cost = 14).. The reason why I chose to make this query automatic is that in the past, when users manually input the transaction number, sometimes they inadvertently entered the wrong transaction number which happened to be a valid number and without realizing it, they signed and gave the customer a receipt with the wrong information on it!

编辑: DBINFO('sqlca.sqlerrd1')

推荐答案

因为你想要一个<$>的最大交易号c $ c> fk_id ,我想我会期望创建一个(可选UNIQUE)索引:

Since you are wanting the maximum transaction number for a fk_id, I think I would expect to create an (optionally UNIQUE) index:

CREATE {UNIQUE} INDEX ix_lookup1 ON Transaction(FK_ID, Transaction_Number);

索引可以是唯一的,因为事务数字本身是唯一的,查找给定的 fk_id 值。

The index can be unique since transaction number alone is unique, but the point of this index is to speed up lookups for a given fk_id value.

然后我希望使用一个查询, href =http://stackoverflow.com/users/644450/oers> oers 建议:

I would then expect to use a query like the one that oers suggested:

SELECT * 
  FROM customer AS c, transaction AS t
 WHERE c.pk_id = t.fk_id
   AND c.pk_id = 12345678         -- crucial to limit the data to one return row
   AND t.transaction_number =
          (SELECT MAX(transaction_number)
             FROM transaction AS t2
            WHERE t2.fk_id = c.pk_id)

这仍然是一个相关的子查询,但是考虑到使用客户编号,只有一行数据要返回,它是优雅的这是这种情况。根据我使用的编程语言,我可能安排使用子查询:

This is still a correlated sub-query, but given that the customer number is used, there will only be one row of data to return and it is blatant to the optimizer that this is the case. Depending on the programming language I was using, I might well arrange for the sub-query to use:

SELECT * 
  FROM customer AS c, transaction AS t
 WHERE c.pk_id = t.fk_id
   AND c.pk_id = 1234567
   AND t.transaction_number =
          (SELECT MAX(transaction_number)
             FROM transaction AS t2
            WHERE t2.fk_id = 1234567)

SELECT * 
  FROM customer AS c, transaction AS t
 WHERE c.pk_id = t.fk_id
   AND c.pk_id = 1234567
   AND t.transaction_number =
          (SELECT MAX(transaction_number)
             FROM transaction AS t2
            WHERE t2.fk_id = 1234567)

您可以使用占位符或从字符串中将SQL创建为suits。

You could use placeholders or create the SQL from a string as suits.

更好的是,我通过在插入之后立即捕获插入的序列值来避免这个问题,以便(a)如果程序已经去了多用户,它仍然会工作,和(b)整个MAX问题将被避免,并且查询可以变为:

Even better, I'd avoid the problem by capturing the inserted serial value immediately after the insert so that (a) if the program ever went multi-user, it would still work and (b) the whole MAX issue would be avoided, and the query could become:

SELECT * 
  FROM customer AS c, transaction AS t
 WHERE c.pk_id = t.fk_id
   AND c.pk_id = 1234567
   AND t.transaction_number = 23456789;

请注意,使用 c.pk_id = 1234567 为客户指定为参数的查询,您甚至可能不需要我建议的索引。无论我在哪里使用数字,在执行查询时为我们提供的值都可以使用占位符。

Note that with the c.pk_id = 1234567 for the customer specified as a parameter to the query, you might not even need the index I suggested. And wherever I used a number, it would be fine with me to use a placeholder for the value to be supplied when the query is executed.

请记住,您使用的是软件这是足够老,喝酒,更不用说投票;优化器不如在更现代的DBMS中找到的优化器。

Remember, you are using software that is old enough to drink, let alone vote; the optimizer is not as good as the one you'd find in a more modern DBMS.

这篇关于在表中获取列的最大值的最快方法是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆