JOINS vs. while语句 [英] JOINS vs. while statements

查看:108
本文介绍了JOINS vs. while语句的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在我上班的公司中,他们运行一个PHP/MySQL关系数据库.我一直以为,如果我需要从不同的表中提取不同的信息,那我可以做一个简单的连接来提取诸如...的数据.

In the company where I came to work, they run a PHP/MySQL relational database. I had always thought that if I needed to pull different info from different tables, that I could just do a simple join to pull in the data such as....

SELECT table_1.id, table_2.id FROM table_1 LEFT JOIN table_2 ON table_1.sub_id = table_2.id

当我到达目前工作的地方时,这就是他们的工作.

When I got to where I currently work, this is what they do.

<?php $query = mysql_query("SELECT sub_id FROM table_1");
while($rs = mysql_fetch_assoc($query)) {
    $query_2 = mysql_fetch_assoc(mysql_query("SELECT * FROM table_2 WHERE id = '{$rs['sub_id']}'"));
    //blah blah blah more queries
?>

当我问为什么要采用第二种方法时,他们说它实际上比联接要快.他们管理着一个数据库,该数据库在不同的表上有数百万条记录,并且某些表有些宽(按行).他们说,如果执行不当的查询可能会锁定一个表(或其中的几个表),则他们希望避免联接.要记住的另一件事是,此数据库上附加了一个庞大的报表生成器,客户可以使用该报表生成器来构建自己的报表,如果他们发疯并生成大型报表,则可能会造成严重破坏.

When I asked why the did it the second way, they said that it actually ran faster than a join. They manage a database that has millions of records on different tables and some of the tables are a little wide (row-wise). They said that they wanted to avoid joins in the case that a poorly executed query could lock up a table (or several of them). One other thing to keep in mind is that there is a massive report builder attached to this database that a client can use to build their own report and if they go crazy and build a big report, it could cause some havoc.

我很困惑,所以我想把这个扔给普通编程人员.这可能是一个见解,但是执行while语句(如果需要的话,可以执行一个较大的查询以拉很多行,然后进行很多小小的子查询)或执行联接(拉动)是否真的更快?一次进行较大的查询以获取所需的所有数据).只要索引正确完成,这有关系吗?要考虑的另一件事是,当前数据库为InnoDB格式.

I was confused so I thought I'd throw this out there for the general programming public. This could be a matter of opinion, but is it really faster to do the while statement (one larger query to pull a lot of rows, followed by a lot of small tiny sub-queries if you will) or to do a join (pull a larger query one time to get all the data you need). As long as indexes are done properly, does it matter? One other thing to consider is that the current DB is in InnoDB format.

谢谢!

更新8/28/14

所以我想我会对此进行更新,并且可以长期使用.讨论之后,我决定在这里重建报告生成器.我没有确切的结果数字,但我想我会分享结果是什么.

So I thought I'd throw up an update to this one and what has worked more long term. After this discussion I decided to rebuild the report generator here at work. I don't have definitive result numbers, but I thought I'd share what the result was.

我认为这有点过头了,因为我将整个报告(就返回的数据而言,这是相当动态的)变成了一个庞大的联接盛宴.大多数联接(如果不是全部)都是将值联接到主键,因此它们的运行速度都非常快.如果该报告说要提取30列数据,并且提取了2000条记录,则每个字段都在运行查询以获取数据(因为那条数据可能位于不同的字段中). 30 x 2000 = 60000,甚至在每次查询0.0003秒的最佳查询时间下,仍然仅是查询时间的18秒(这我几乎记得了).现在,我将查询重建为大量主键上的大量联接(如果可能),该报告在大约2-3秒内加载,并且大部分时间都在下载html.根据需要的数据,每个返回的记录将在0-4个额外查询之间运行(如果可以在联接中获取数据,则可能不需要任何数据,这种情况发生的时间为75%).因此,相同的2000条记录将返回额外的0-8000条查询(比60000条要好得多).

I think went a little overkill because I turned the entire report (it's pretty dynamic as far as the data that's returned) into a massive join fest. Most of the joins, if not all are joining a value to a primary key so they all run really really fast. If the report had lets say 30 columns of data to pull and it pulled 2000 records, every single field was running a query to fetch the data (because that piece of data could be on a different field). 30 x 2000 = 60000 and even under a sweet query time of 0.0003 seconds per query, that was still 18 seconds of just query time (which is pretty much what I remember it being). Now that I rebuilt the query as a massive join on a bunch of primary keys (where possible), that same report loaded in about 2-3 seconds, and most of that time was downloading the html. Each record that returns runs between 0-4 extra queries depending on the data that's needed (may not need any data if it can fetch it in the joins, which happens 75% of the time). So the same 2000 records would return an additional 0-8000 queries, (much better than 60000).

我想说,while语句在某些情况下很有用,但是正如下面的注释中所述,基准测试就是它的全部内容.就我而言,联接是更好的选择,但是在我网站的其他区域,使用while语句更为有用.在一个实例中,我有一个报告,其中客户可以请求多个类别进行拉取,并且仅返回这些类别的数据.发生了什么事,我有一个ID为50-500的category_id IN(...,...,..,.., etc etc etc),当我在最后关头握住它时,索引会cho住并死掉.因此,我的工作是将ID分成10个组,并执行相同的查询x/10次,因此我的结果获取 way 的速度比以前快,因为索引喜欢处理10个ID,而不是500个,因此,通过执行while语句,我的查询得到了很大的改进.

I would say that the while statement is useful in some cases, but as stated below in the comments, benchmarking is what it's all about. In my case, joins were the better option, but in other areas of my site, a while statement is more useful. In one instance I have a report where a client could request several categories to pull by and only return data for those categories. What happened was I had a category_id IN(...,...,..,.., etc etc etc) with 50-500 IDs and the index would choke and die in my arms as I was holding it in it's final moments. So what I did was spread out the ids in groups of 10 and ran the same query x / 10 times and my results were fetch way faster than before because the index likes dealing with 10 IDs, not 500, so I saw a great improvement on my queries then because of doing the while statement.

推荐答案

如果正确使用了索引,则使用JOIN几乎总是更有效.添加重点是因为最佳效率并不总是等于最佳性能.

If the indexes are properly used, then it is almost always more efficient to use a JOIN. The emphasis is added because best efficiency does not always equal best performance.

但是,实际上并没有一个一刀切"的答案.您应该使用EXPLAIN分析查询,以确保确实使用了索引,没有不必要的临时表使用,等等.在某些情况下,条件共同创建了一个仅不能使用索引.在这种情况下,可能以您指定的方式更快地将查询分为几部分.

There isn't really a one-size-fits all answer, though; you should analyze a query using EXPLAIN to ensure that the indexes are indeed being used, that there is no unnecessary temp table use, etc. In some cases, conditions conspire to create a query that just can't use indexes. In those cases, it might be faster to separate the queries into pieces in the fashion you've indicated.

如果我在现有项目中遇到过这样的代码,我会提出以下问题:检查查询,考虑执行查询的不同方法,确保已考虑了这些问题,为或建立了科学的,事实支持的案例反对实践.确保不对原始开发人员进行尽职调查,因为未使用JOIN会表面上指出不良的数据库或查询设计.但是,最后,结果大声疾呼,如果所有优化和更正仍然导致联接比使用查询片段提供的联接慢,则以更快的解决方案为准.进行基准测试并根据基准测试结果采取行动;在软件设计中,绝不应该为了降低性能而牺牲对应做或不应该做的任意规则的遵守.表现最好的方法就是最好的方法.

If I encountered such code in an existing project, I would question it: check the query, think of different ways to perform the query, make sure that these things have been considered, build a scientific, fact-supported case for or against the practice. Make sure that the original developers did their due diligence, since not using a JOIN superficially points to poor database or query design. In the end, though, the results speak loudly and if all the optimizations and corrections still result in a slower join than using query fragments provides, then the faster solution prevails. Benchmark and act on the results of the benchmark; there is no case in software design that you should trade poor performance for adhesion to arbitrary rules about what you should or should not do. The best-performing method is the best method.

这篇关于JOINS vs. while语句的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆