在PHP内部加入分解 [英] join decomposition inside PHP

查看:89
本文介绍了在PHP内部加入分解的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我碰到一篇有关Join分解的文章.

I came across an article about Join decomposition.

场景#1(不好):

Select * from tag
Join tag_post ON tag_post.tag_id=tag.id
Join post ON tag_post.post_id=post.id
Where tag.tag='mysql'

场景#2(好):

Select * from tag where tag='mysql'

Select * from tag_post Where tag_id=1234

Select * from post where post.id in (123,456,9098,545)

出于种种原因(特别是缓存),建议坚持方案#2. 问题是如何加入我们的应用程序.你能用PHP给我们一个例子吗 分别取回它们之后? (我已阅读 MyISAM Performance:Join Decomposition? 但这没有帮助)

It was suggested to stick to scenario #2 for many reasons specially caching. The question is how to join inside our application. Could u give us an example with PHP after retrieving them individually? (I have read MyISAM Performance: Join Decomposition? but it did not help)

推荐答案

您可以使用SQL子选择(如果我理解您的问题).当SQL具有所有功能时,使用PHP会很奇怪.

You COULD use an SQL subselect (if I understand your question). Using PHP would be rather odd while SQL has all the capabilities.

SELECT *
FROM `post`
WHERE `id` IN (
    SELECT `post_id`
    FROM `tag_post`
    WHERE `tag_id` = (
        SELECT `tag_id`
        FROM `tag`
        WHERE `tag` = 'mysql'
    )
)

我不确定您的数据库结构如何,但这应该可以帮助您入门.这几乎是SQL的开始.查询中的查询.您可以使用子选择的结果选择数据.

I'm not sure how your database structure looks like, but this should get you started. It's pretty much SQL inception. A query within a query. You can select data using the result of a subselect.

请在复制此SQL并告诉我它不起作用之前,请验证所有表名和列名.

Please, before copying this SQL and telling me it's not working, verify all table and column names.

在任何人开始哭诉速度,缓存和效率之前:我认为这是相当有效的.不必选择所有数据并使用PHP进行遍历,您只需使用本机SQL即可选择较小的位.

Before anyone starts to cry about speed, caching and efficiency: I think this is rather efficient. Instead of selecting ALL data and loop through it using PHP you can just select smaller bits using native SQL as it was ment to be used.

同样,我强烈不鼓励使用PHP获取特定数据.您只需SQL.

Again, I highly discourage to use PHP to get specific data. SQL is all you need.

这是您的脚本

假设您有一些包含所有数据的多维数组:

Assuming you have some multi-dimensional arrays containing all data:

// dummy results

// table tag
$tags = array(
    // first record
    array(
        'id'    => 0,
        'tag'   => 'mysql'
    ), 
    // second record
    array(
        'id'    => 1,
        'tag'   => 'php'
    )
    // etc
);

// table tag_post
$tag_posts = array(
    // first record
    array(
        'id'        => 0,
        'post_id'   => 0,   // post #1
        'tag_id'    => 0    // has tag mysql
    ),
    // second record
    array(
        'id'        => 1,
        'post_id'   => 1,   // post #2
        'tag_id'    => 0    // has tag mysql
    ),
    // second record
    array(
        'id'        => 2,
        'post_id'   => 2,   // post #3
        'tag_id'    => 1    // has tag mysql
    )
    // etc
);

// table post
$posts = array(
    // first record
    array(
        'id'        => 0,
        'content'   => 'content post #1'
    ),
    // second record
    array(
        'id'        => 1,
        'content'   => 'content post #2'
    ),
    // third record
    array(
        'id'        => 2,
        'content'   => 'content post #3'
    )
    // etc
);

// searching for tag
$tag = 'mysql';
$tagid = -1;
$postids = array();
$results = array();

// first get the id of this tag
foreach($tags as $key => $value) {
    if($value['tag'] === $tag) {
        // set the id of the tag
        $tagid = $value['id'];

        // theres only one possible id, so we break the loop
        break;
    }
}

// get post ids using the tag id
if($tagid > -1) { // verify if a tag id was found
    foreach($tag_posts as $key => $value) {
        if($value['tag_id'] === $tagid) {
            // add post id to post ids
            $postids[] = $value['post_id'];
        }
    }
}

// finally get post content
if(count($postids) > 0) { //verify if some posts were found
    foreach($posts as $key => $value) {
        // check if the id of the post can be found in the posts ids we have found
        if(in_array($value['id'], $postids)) {
            // add all data of the post to result
            $results[] = $value;
        }
    }
}

如果您看一下上面脚本的长度,这正是我坚持使用SQL的原因.

If you look at the length of the script above, this is exactly why I'd stick to SQL.

现在,正如我记得的那样,您想使用PHP join,而不是使用SQL.这不是联接,而是使用某些数组获得结果.我知道,但是加入仅会浪费时间,而且效率不如仅保留所有结果一样.

Now, as I recall, you wanted to join using PHP, rather doing it in SQL. This is not a join but getting results using some arrays. I know, but a join would only be a waste of time and less efficient than just leaving all results as they are.

21-12-12,因为下面的评论

我做了一些基准测试,结果非常惊人:

I've done a little benchmark and the results are quite stunning:

DATABASE RECORDS:
tags:           10
posts:          1000
tag_posts:      1000 (every post has 1 random tag)

Selecting all posts with a specific tag resulted in 82 records.

SUBSELECT RESULTS:
run time:                        0.772885084152
bytes downloaded from database:  3417

PHP RESULTS:
run time:                        0.086599111557
bytes downloaded from database:  48644



Please note that the benchmark had both the application as the database on the
same host. If you use different hosts for the application and the database layer,
the PHP result could end up taking longer because naturally sending data between
two hosts will take much more time then when they're on the same host.

即使子选择返回的数据少得多,请求的持续时间却长了将近10倍...

Even though the subselect returns much less data, the duration of the requests is nearly 10 times longer...

我从来没有期望过这些结果,所以我确信,当我知道性能很重要时,我一定会使用此信息,但是我仍然会使用SQL进行较小的操作...

I've NEVER expected these results, so I'm convinced and I will certainly use this information when I know that performance is important however I will still use SQL for smaller operations hehe...

这篇关于在PHP内部加入分解的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆