customer.pk_name加入transaction.fk_name与customer.pk_id [串行]加入transaction.fk_id [整数] [英] customer.pk_name joining transactions.fk_name vs. customer.pk_id [serial] joining transactions.fk_id [integer]

查看:76
本文介绍了customer.pk_name加入transaction.fk_name与customer.pk_id [串行]加入transaction.fk_id [整数]的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当铺应用程序(任何RDBMS):

Pawnshop Application (any RDBMS):

一对多关系,其中每个客户(主)可以进行许多交易(详细信息).

one-to-many relationship where each customer (master) can have many transactions (detail).

customer(
id serial,
pk_name char(30), {PATERNAL-NAME MATERNAL-NAME, FIRST-NAME MIDDLE-NAME-INITIAL}
[...]
);
unique index on id;
unique cluster index on pk_name;


transaction(
fk_name char(30),
tran_type char(1), 
ticket_number serial,
[...]
);
dups cluster index on fk_name;
unique index on ticket_number; 

几个人告诉我,这不是将大师加入细节的正确方法.他们说我应该始终将customer.id [serial]加入Transactions.id [integer].

Several people have told me this is not the correct way to join master to detail. They said I should always join customer.id[serial] to transactions.id[integer].

当客户典当商品时,店员会使用通配符查询主人的姓名.该查询通常返回几个客户,业务员滚动直到找到正确的名称,然后输入"D"以更改为明细交易表,自动查询所有交易,然后业务员输入"A"以添加新交易.

When a customer pawns merchandise, clerk queries the master using wildcards on name. The query usually returns several customers, clerk scrolls until locating the right name, enters a 'D' to change to detail transactions table, all transactions are automatically queried, then clerk enters an 'A' to add a new transaction.

使用customer.id加入transaction.id的问题在于,尽管customer表按已排序的名称顺序进行维护,但按fk_id组将事务表聚类 通过fk_id进行的交易,但它们与客户名称的顺序不同,因此,当业务员在主数据库中滚动浏览客户名称时,系统必须跳过整个位置以查找属于每个客户的集群交易.添加每个新客户时,会将下一个ID分配给该客户,但是新客户不会按字母顺序显示.我使用id连接进行了实验,并确认了性能下降.

The problem with using customer.id joining transaction.id is that although the customer table is maintained in sorted name order, clustering the transaction table by fk_id groups the transactions by fk_id, but they are not in the same order as the customer name, so when clerk is scrolling through customer names in the master, the system has to jump allover the place to locate the clustered transactions belonging to each customer. As each new customer is added, the next id is assigned to that customer, but new customers dont show up in alphabetical order. I experimented using id joins and confirmed the decrease in performance.

使用名称联接与ID联接的缺点是,如果您更改客户名称,则切断了与他们的交易的联接,因此我不允许更新名称.无论如何,一个人需要多久更改一次客户名称?另一个缺点是name需要30个字符,其中id为INT,因此.dat和.idx较大.每天早晨执行sql proc,以按排序的名称顺序卸载客户和交易,删除/重新创建表,加载卸载的数据,并重新创建所有索引,从而使性能保持最佳状态.

The drawbacks of using name joins vs. id joins is if you change customer name, the join with their transactions is severed, so I dont allow updating the name. Anyway, how often does one need to change a customers name? The other draw back is name requires 30 chars where id is INT, so .dat and .idx are larger. Every morning an sql proc is executed which unloads customer and transactions in sorted name order, drops/re-creates the tables, loads the unloaded data and all indexes are re-created which keeps performance optimized.

如果事务没有名称"列,如何使用ID联接而不是名称联接,并仍然按名称保留群集的事务顺序?

How can I use id joins instead of name joins and still preserve the clustered transaction order by name if transactions has no name column?

以下是使用pk/fk名称时数据如何放置在customer.dat和transaction.dat中的示例,如上述架构中所述:

The following is an example of how the data sits in customer.dat and transactions.dat when using pk/fk name, as described in the above schema:

customer.id customer.pk_name               transaction.fk_name            transaction.ticket_number
----------- ------------------------------ ------------------------------ -------------
          2|ACEVEDO BERMUDEZ, FRANCISCO J. ACEVEDO BERMUDEZ, FRANCISCO J.|123456
                                           ACEVEDO BERMUDEZ, FRANCISCO J.|123789

          3|ANDUJAR RODRIGUEZ, WILFREDO C. ANDUJAR RODRIGUEZ, WILFREDO C.|101010
                                           ANDUJAR RODRIGUEZ, WILFREDO C.|121212

          1|CASTILLO DIAZ, FRANKLIN J.     CASTILLO DIAZ, FRANKLIN J.    |232323
                                           CASTILLO DIAZ, FRANKLIN J.    |343434

因此,当业务员按客户主名称进行通配符查询时,当业务员滚动返回返回到当前列表的名称时(由于它们与主名称的排序顺序相同),将自动查询并快速显示客户交易.

So, when clerk wilcard queries by customer master name, customers transactions are automatically queried and quickly displayed when clerk scrolls thru names returned into the current list since they are in the same sorted order as the master.

现在,以下示例是使用pk/fk id的相同数据:

Now, the following example is the same data using pk/fk id:

customer.pk_id customer.name                  transactions.fk_id transactions.ticket_#
-------------- ------------------------------ ------------------ ---------------------
             2|ACEVEDO BERMUDEZ, FRANCISCO J.                  1|232323
                                                               1|343434

             3|ANDUJAR RODRIGUEZ, WILFREDO C.                  2|123456
                                                               2|123789

             1|CASTILLO DIAZ, FRANKLIN J.                      3|101010
                                                               3|121212

好的,因此请记住,我的执行1页屏幕包含所有客户列和所有交易列,并且有一个主/明细指令,当业务员按客户名称查询时,该客户的第一笔交易行自动显示.然后,业务员将按"D"键使交易成为活动表,并按"A"键添加新交易,或者业务员可滚动浏览所有客户交易以特别更新一个交易或仅向客户提供信息.

OK, so now keep in mind that my perform 1-page screen includes all customer columns and all transactions columns, and there's a master/detail instruction which when the clerk queries by customer name, the first transaction row belonging to that customer is automatically displayed. Then the clerk will press 'D' to make transactions the active table and press 'A' to add a new transaction, or clerk may scroll through all the customers transactions to update one in particular or just provide customer with info.

当使用pk/fk名称方法时,当店员滚动浏览客户名称以查找所需客户时,立即做出响应.而使用pk/fk id方法时,即使在支持索引的情况下,响应时间也会滞后,因为当秘书在每个客户名称中滚动时,引擎必须跳至交易表中的不同位置才能找到属于每个客户的相应交易组.在高手!

When using the pk/fk name method, as the clerk scrolls through customer names to locate the desired customer, response is immediate. Whereas when using the pk/fk id method, response time lags, even with supported indexing, because the engine has to jump to different locations in the transactions table to locate the corresponding group of transactions belonging to each customer as clerk scrolls through each customer name in the master!

因此,似乎将客户的交易行分组在一起并以与客户行相同的排序顺序,使索引可以更快地找到交易,而不必跳过每个客户交易的分散组. 如果每个客户都能记住他们的客户i.d.数字,那么我的问题将是学术性的,但在现实世界中,我们甚至给每个客户一个i.d.卡上印有客户编号,但大多数人都丢失了卡!

So, it seems like having the customer's transaction rows grouped together and in the same sorted order as the customer rows allows the indexing to locate the transactions quicker as opposed to having to jump all over scattered groups of each customers transactions. If each customer could remember their customer i.d. number, then my issue would be academic, but in the realworld, we even gave each customer an i.d. card with their customer number on it, but most of them lost their cards!

下面是当铺在营业前每天早上执行的每日重组的一个示例:

Here's an example of the daily reorg executed every morning before pawnshop opens for business:

 {ISQL-SE (customer and transactions table reorg - once-daily, before start of    
  business, procedure}

 unload to "U:\UNL\CUSTOMERS.UNL"
    select * from customer
  order by customer.pk_name; 

 unload to "U:\UNL\TRAN_ACTIVES.UNL" 
    select * from transaction where transaction.status = "A" 
  order by transaction.fk_name, transaction.trx_date; 

 unload to "U:\UNL\TRAN_INACTIVES.UNL" 
    select * from transaction
     where transaction.status != "A" 
       and transaction.trx_date >= (today - 365) 
  order by transaction.fk_name, transaction.trx_date desc; 

 unload to "U:\UNL\TRAN_HISTORIC.UNL" 
    select * from transaction 
     where transaction.status != "A" 
       and transaction.trx_date < (today - 365) 
  order by transaction.trx_date desc; 

 drop table customer;     

 drop table transaction;

 create table customer
 (
  id serial,
  pk_name char(30),
  [...]
 ) 
 in "S:\PAWNSHOP.DBS\CUSTOMER";


 create table transaction
 ( 
  fk_name char(30),
  ticket_number serial,
  tran_type char(1), 
  status char(1), 
  trx_date date, 
  [...]
 )
 in "S:\PAWNSHOP.DBS\TRANSACTION"; 

 load from "U:\UNL\CUSTOMERS.UNL"      insert into customer     {>4800 nrows}
 load from "U:\UNL\TRAN_ACTIVES.UNL"   insert into transaction; {500:600 nrows avg.} 
 load from "U:\UNL\TRAN_INACTIVES.UNL" insert into transaction; {6500:7000 nrows avg.} 
 load from "U:\UNL\TRAN_HISTORIC.UNL"  insert into dss:historic;{>500K nrows} 

 create unique cluster index cust_pk_name_idx on customer(pk_name);
 create        cluster index tran_cust_idx    on transaction(fk_name); 

 {this groups each customers transactions together, actives in 
  oldest trx_date order first, then inactive transactions within the last year in most  
  recent trx_date order. inactives older than 1 year are loaded into historic  
  table in a separate database, on a separate hard disk. historic table  
  optimization is done on a weekly basis for DSS queries.} 

 create unique index tran_ticket_num_idx on transaction(ticket_num); 
 create        index tran_trx_date_idx   on transaction(trx_date); 
 create        index tran_status_idx     on transaction(status); 
 [...;]

 [grant statements...;] 

 update statistics; 

如果您有时间,我会挑战任何人进行测试!..当您拥有一张大桌子时,这一点会更加明显.

If you have time, I CHALLENGE ANYONE TO TEST THIS!.. It's more noticeable when you have a large table.

推荐答案

它们是正确的.加入CHAR(30)文本字段-尤其是其中包含人员姓名数据的文本字段-速度慢,效率低下并且非常脆弱.人们确实会更改姓名(婚姻就是一个明显的例子),并且多个人可以具有相同的名字.

They're right. Joining on a CHAR(30) text field - particularly one containing person name data - will be slow, grossly inefficient and incredibly fragile. People do change their names (marriage is the obvious example), and multiple people can have the same name.

您要在表上创建适当的索引,以支持希望数据出现的顺序,而忘记了群集.您的性能优化过程听起来像一场灾难,寻找一个发生的地方.抱歉,但是像这样删除/创建表会带来麻烦.

You want to create appropriate indexes on your tables to support the order you want data to appear in, and forget clustering. Your performance optimisation procedure sounds like a disaster looking for a place to happen. Sorry, but dropping/creating the tables like that is asking for trouble.

我将从customer.id上的UNIQUE INDEX,transaction.ticket_number上的UNIQUE INDEX开始,以及交易(id,ticket_number DESC)的INDEX(对于性能而不是基数,因此强制唯一性不是很重要),以及从那里拿走.数据从事务表返回的顺序是在索引中出现的顺序.

I would start with a UNIQUE INDEX on customer.id, a UNIQUE INDEX on transaction.ticket_number, and an INDEX (for performance rather than cardinality, so enforcing uniqueness not terribly important) on transactions (id, ticket_number DESC), and take it from there. Data is returned from the transaction table in the order it appears in the index.

我只会在所有其他查询优化途径都用尽时才考虑集群化.

I would only consider clustering when all other avenues of query optimisation have been exhausted.

这篇关于customer.pk_name加入transaction.fk_name与customer.pk_id [串行]加入transaction.fk_id [整数]的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆