Should I use a JOIN function or run several queries in a loop structure?


Problem description

I have these two MySQL tables, TableA and TableB:

TableA
* ColumnAId
* ColumnA1
* ColumnA2
TableB
* ColumnBId
* ColumnAId
* ColumnB1
* ColumnB2

In PHP, I want to end up with this multidimensional array format:

$array = array(
    array(
        'ColumnAId' => value,
        'ColumnA1' => value,
        'ColumnA2' => value,
        'TableB' => array(
            array(
                'ColumnBId' => value,
                'ColumnAId' => value,
                'ColumnB1' => value,
                'ColumnB2' => value
            )
        )
    )
);

so that I can loop over it this way:

foreach($array as $i => $TableA) {
    echo 'ColumnAId' . $TableA['ColumnAId'];
    echo 'ColumnA1' . $TableA['ColumnA1'];
    echo 'ColumnA2' . $TableA['ColumnA2'];
    echo 'TableB\'s';
    foreach($TableA['TableB'] as $j => $TableB) {
        echo $TableB['...']...
        echo $TableB['...']...
    }
}

My question is: what is the best or proper way to query the MySQL database so that I can achieve this goal?

Solution 1 (the one I'm using)

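$array = array();
$rs = mysqli_query($con, "SELECT * FROM TableA");
while ($row = mysqli_fetch_assoc($rs)) {
    // One extra query per TableA row: this is the query-in-a-loop pattern
    $rs2 = mysqli_query($con, "SELECT * FROM TableB WHERE ColumnAId=" . $row['ColumnAId']);
    $array2 = mysqli_fetch_all($rs2, MYSQLI_ASSOC); // child rows as an array
    $row['TableB'] = $array2;
    $array[] = $row;
}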

I have doubts about my code because it queries the database on every iteration.

Solution 2

$rs = mysqli_query($con, "SELECT * FROM TableA JOIN TableB ON TableA.ColumnAId=TableB.ColumnAId");
while ($row = mysqli_fet...) {
    // Code
}

The second solution queries only once, but if I have thousands of rows in TableA and thousands of rows in TableB for each TableB.ColumnAId (1 TableA.ColumnAId = 1000 TableB.ColumnAId), does Solution 2 take much more time than Solution 1?

Recommended answer

Neither of the two proposed solutions is likely to be optimal, BUT Solution 1 is UNPREDICTABLE and thus INHERENTLY FLAWED!

One of the first things you learn when dealing with large databases is that 'the best way' to do a query often depends on factors (referred to as meta-data) within the database:

  • How many rows there are.
  • How many tables you are querying.
  • The size of each row.

Because of this, there's unlikely to be a silver bullet solution for your problem. Your database is not the same as my database, so you will need to benchmark different optimizations if you need the best performance available.

You will probably find that building and applying correct indexes in your database (and understanding how MySQL implements indexes natively) does a lot more for you.

There are some golden rules with queries which should rarely be broken:

  • Don't run them in loop structures. As tempting as it often is, the overhead of creating a connection, executing a query and getting a response is high.
  • Avoid SELECT * unless needed. Selecting more columns than you use can significantly increase the overhead of your SQL operations.
  • Know thy indexes. Use the EXPLAIN feature to see which indexes are being used, optimize your queries to use what's available, and create new ones where needed.

Because of this, of the two I'd go for the second query (replacing SELECT * with only the columns you want), but there are probably better ways to structure the query if you have the time to optimize.
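The single JOIN returns flat rows, so they still have to be folded into the nested array the question asks for. The following is a minimal sketch of that step, not a definitive implementation. It assumes the rows come from a query that names its columns, such as `SELECT a.ColumnAId, a.ColumnA1, a.ColumnA2, b.ColumnBId, b.ColumnB1, b.ColumnB2 FROM TableA a LEFT JOIN TableB b ON a.ColumnAId = b.ColumnAId`. The LEFT JOIN (to keep TableA rows without children), the helper name `nestJoinedRows`, and the sample data are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: folds flat JOIN rows into one entry per TableA row,
// with the matching TableB rows collected under the 'TableB' key.
function nestJoinedRows(array $rows): array
{
    $byId = [];
    foreach ($rows as $row) {
        $id = $row['ColumnAId'];
        if (!isset($byId[$id])) {
            // First time this TableA row is seen: create the parent entry.
            $byId[$id] = [
                'ColumnAId' => $id,
                'ColumnA1'  => $row['ColumnA1'],
                'ColumnA2'  => $row['ColumnA2'],
                'TableB'    => [],
            ];
        }
        // With a LEFT JOIN, a parent without children has a NULL ColumnBId.
        if ($row['ColumnBId'] !== null) {
            $byId[$id]['TableB'][] = [
                'ColumnBId' => $row['ColumnBId'],
                'ColumnAId' => $id,
                'ColumnB1'  => $row['ColumnB1'],
                'ColumnB2'  => $row['ColumnB2'],
            ];
        }
    }
    return array_values($byId); // re-index as a plain list
}

// Made-up sample rows standing in for repeated mysqli_fetch_assoc() results.
$rows = [
    ['ColumnAId' => 1, 'ColumnA1' => 'a', 'ColumnA2' => 'b', 'ColumnBId' => 10, 'ColumnB1' => 'x', 'ColumnB2' => 'y'],
    ['ColumnAId' => 1, 'ColumnA1' => 'a', 'ColumnA2' => 'b', 'ColumnBId' => 11, 'ColumnB1' => 'p', 'ColumnB2' => 'q'],
    ['ColumnAId' => 2, 'ColumnA1' => 'c', 'ColumnA2' => 'd', 'ColumnBId' => null, 'ColumnB1' => null, 'ColumnB2' => null],
];
$array = nestJoinedRows($rows);
```

The result can be walked with the nested foreach from the question. The grouping is done in a single pass and does not require the rows to be sorted by ColumnAId.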

However, speed should NOT be your only consideration here; there is a GREAT reason not to use Solution 1:

One of the other answers suggests that having the table locked for a long period of time is a bad thing, and that therefore the multiple-query solution is good.

I would argue that this couldn't be further from the truth. In fact, I'd argue that in many cases the predictability of running a single locking SELECT query is a greater argument FOR running that query than the optimization & speed benefits.

First of all, when we run a SELECT (read-only) query on a MyISAM table, the table is read-locked. This prevents any WRITE operations from happening on the table until the read lock is released (our SELECT query either completes or fails). Other SELECT queries are not affected, so if you're running a multi-threaded application, they will continue to work. InnoDB (the default engine in current MySQL versions) normally gives a plain SELECT a consistent snapshot instead of a table lock, so writers are not delayed, but the single query still sees the data as of one point in time.

This delay is a GOOD thing. Why, you may ask? Relational data integrity.

Let's take an example: we're running an operation to get a list of items currently in the inventory of a bunch of users in a game, so we do this join:

SELECT * FROM `users` JOIN `items` ON `users`.`id`=`items`.`inventory_id` WHERE `users`.`logged_in` = 1;

What happens if, during this query operation, a user trades an item to another user? Using this query, we see the game state as it was when we started the query: the item exists once, in the inventory of the user who had it before we ran the query.

If we run separate queries in a loop instead, then depending on whether the user traded the item before or after we read his details, and in which order we read the inventories of the two players, there are four possibilities:

  1. The item could be shown in the first user's inventory (scan user A -> scan user B -> item traded OR scan user B -> scan user A -> item traded).
  2. The item could be shown in the second user's inventory (item traded -> scan user A -> scan user B OR item traded -> scan user B -> scan user A).
  3. The item could be shown in both inventories (scan user A -> item traded -> scan user B).
  4. The item could be shown in neither user's inventory (scan user B -> item traded -> scan user A).

This means that we would be unable to predict the results of the query or to ensure relational integrity.
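The interleavings can be reproduced without a database. The sketch below is only an in-memory simulation under stated assumptions: a single hypothetical item named `sword` starts with user A and is traded to user B, the function name `scanInventories` is made up, and each loop iteration stands in for one per-user query.

```php
<?php
// Hypothetical simulation of per-user queries run in a loop.
// $scanOrder  : the order in which the users' inventories are read.
// $tradeAfter : the trade from A to B happens after this many scans
//               (0 = before any scan, count($scanOrder) = after all scans).
function scanInventories(array $scanOrder, int $tradeAfter): array
{
    $owner = ['sword' => 'A']; // item => current owner; A owns it at the start
    $seen = [];
    foreach ($scanOrder as $i => $user) {
        if ($i === $tradeAfter) {
            $owner['sword'] = 'B'; // a concurrent write lands between two reads
        }
        // Stands in for "SELECT ... FROM items WHERE inventory_id = <user>".
        $seen[$user] = array_keys($owner, $user);
    }
    return $seen;
}

$both    = scanInventories(['A', 'B'], 1); // scan A -> trade -> scan B
$neither = scanInventories(['B', 'A'], 1); // scan B -> trade -> scan A
```

Here `$both` lists the item under both users and `$neither` lists it under no one, which are cases 3 and 4. A single JOIN reads one consistent state, so only cases 1 and 2 can occur with it.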

If you're planning to give $5,000 to the guy with item ID 1000000 at midnight on Tuesday, I hope you have $10,000 on hand. If your program relies on unique items being unique when snapshots are taken, this kind of query may well raise an exception.

Locking is good because it increases predictability and protects the integrity of results.

Note: You could force a loop to lock by wrapping it in a transaction, but it will still be slower.

You should NEVER build a statement like this:

mysqli_query($con, "SELECT * FROM TableB WHERE ColumnAId=" . $row['ColumnAId']);

mysqli supports prepared statements. Read about them and use them; they will help you avoid something terrible happening to your database.
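As a rough illustration of the same lookup written with a prepared statement, here is a minimal sketch. The helper name `fetchTableBRows` is hypothetical, `$con` is assumed to be an open mysqli connection, ColumnAId is assumed to be an integer column, and `mysqli_stmt_get_result()` is only available when PHP uses the mysqlnd driver.

```php
<?php
// Hypothetical helper: fetches the TableB rows for one ColumnAId value.
// The value is bound as a parameter and never concatenated into the SQL text.
function fetchTableBRows(mysqli $con, int $columnAId): array
{
    $stmt = mysqli_prepare(
        $con,
        'SELECT ColumnBId, ColumnAId, ColumnB1, ColumnB2 FROM TableB WHERE ColumnAId = ?'
    );
    mysqli_stmt_bind_param($stmt, 'i', $columnAId); // 'i' = integer (assumed type)
    mysqli_stmt_execute($stmt);
    $result = mysqli_stmt_get_result($stmt);        // requires mysqlnd
    $rows = mysqli_fetch_all($result, MYSQLI_ASSOC);
    mysqli_stmt_close($stmt);
    return $rows;
}
```

Binding the value removes the injection risk, but calling this helper once per TableA row is still the query-in-a-loop pattern advised against above.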
