MySQL vs MongoDB 1000 reads


Problem description

I have been very excited about MongoDB and have been testing it lately. I had a table called posts in MySQL with about 20 million records, indexed only on a field called 'id'.

I wanted to compare speed with MongoDB and I ran a test which would get and print 15 records randomly from our huge databases. I ran the query about 1,000 times each for MySQL and MongoDB and I am surprised that I did not notice much difference in speed. Maybe MongoDB is 1.1 times faster. That's very disappointing. Is there something I am doing wrong? I know that my tests are not perfect, but is MySQL on par with MongoDB when it comes to read-intensive chores?


Note:

  • I have a dual core + (2 threads) i7 CPU and 4GB of RAM
  • I have 20 partitions on MySQL, each with 1 million records

Sample code used to test MongoDB

<?php
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$time_taken = 0;
$tries = 100;
// start timing; note that a new connection is opened on every iteration
$time_start = microtime_float();

for($i=1;$i<=$tries;$i++)
{
    $m = new Mongo();       // connect using the legacy PHP Mongo driver
    $db = $m->swalif;
    // fetch the 15 documents whose 'id' is in a random set
    $cursor = $db->posts->find(array('id' => array('$in' => get_15_random_numbers())));
    foreach ($cursor as $obj)
    {
        //echo $obj["thread_title"] . "<br><Br>";
    }
}

$time_end = microtime_float();
$time_taken = $time_taken + ($time_end - $time_start);
echo $time_taken;

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000) ;

    }
    return $numbers;
}

?>


Sample code used to test MySQL

<?php
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$BASE_PATH = "../src/";
include_once($BASE_PATH  . "classes/forumdb.php");

$time_taken = 0;
$tries = 100;
$time_start = microtime_float();
for($i=1;$i<=$tries;$i++)
{
    $db = new AQLDatabase();    // instantiate the custom database wrapper on every iteration
    // fetch the 15 rows whose id is in a random set
    $sql = "select * from posts_really_big where id in (".implode(',',get_15_random_numbers()).")";
    $result = $db->executeSQL($sql);
    while ($row = mysql_fetch_array($result) )
    {
        //echo $row["thread_title"] . "<br><Br>";
    }
}
$time_end = microtime_float();
$time_taken = $time_taken + ($time_end - $time_start);
echo $time_taken;

function get_15_random_numbers()
{
    $numbers = array();
    for($i=1;$i<=15;$i++)
    {
        $numbers[] = mt_rand(1, 20000000);

    }
    return $numbers;
}
?>

Answer

MongoDB is not magically faster. If you store the same data, organised in basically the same fashion, and access it exactly the same way, then you really shouldn't expect your results to be wildly different. After all, MySQL and MongoDB are both GPL, so if Mongo had some magically better IO code in it, then the MySQL team could just incorporate it into their codebase.

People are seeing real-world MongoDB performance gains largely because MongoDB allows you to query in a different manner that is more sensible for your workload.

For example, consider a design that persisted a lot of information about a complicated entity in a normalised fashion. This could easily use dozens of tables in MySQL (or any relational db) to store the data in normal form, with many indexes needed to ensure relational integrity between tables.

Now consider the same design with a document store. If all of those related tables are subordinate to the main table (and they often are), then you might be able to model the data such that the entire entity is stored in a single document. In MongoDB you can store this as a single document, in a single collection. This is where MongoDB starts enabling superior performance.
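
For example (a minimal sketch only: the field names and values below are hypothetical, loosely modelled on the question's posts collection, and it uses the same legacy PHP Mongo driver as the test code above), a post together with its comments and tags can be written as one embedded document:

<?php
// Hypothetical: a single document embeds what would be several child tables
// (comments, tags) in a normalised relational schema.
$m = new Mongo();    // legacy PHP driver, as in the question's test code
$db = $m->swalif;

$db->posts->insert(array(
    'id'           => 42,
    'thread_title' => 'Example thread',
    'comments'     => array(
        array('user' => 'alice', 'body' => 'First comment'),
        array('user' => 'bob',   'body' => 'Second comment'),
    ),
    'tags'         => array('mysql', 'mongodb'),
));
?>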

In MongoDB, to retrieve the whole entity, you have to perform:

  • One index lookup on the collection (assuming the entity is fetched by id)
  • Retrieval of the contents of one database page (the actual binary JSON document)

So a b-tree lookup, and a binary page read. Log(n) + 1 IOs. If the indexes can reside entirely in memory, then 1 IO.
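
As a rough sketch (assuming an index on 'id' and reusing the names from the question's test code), retrieving the whole entity is a single indexed lookup:

<?php
$m = new Mongo();                          // legacy PHP driver
$db = $m->swalif;

$db->posts->ensureIndex(array('id' => 1)); // the b-tree used for the lookup

// One index lookup plus one document read returns the entire entity,
// embedded children and all.
$entity = $db->posts->findOne(array('id' => 42));
?>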

In MySQL with 20 tables, you have to perform:

  • One index lookup on the root table (again, assuming the entity is fetched by id)
  • With a clustered index, we can assume that the values for the root row are in the index
  • 20+ range lookups (hopefully on an index) for the entity's pk value
  • These probably aren't clustered indexes, so 20+ more data lookups once we've figured out what the appropriate child rows are

So the total for MySQL, even assuming that all indexes are in memory (which is harder since there are 20 times more of them), is about 20 range lookups.
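
A rough sketch of that access pattern with the same legacy mysql_* API used in the test code above (the connection details and child table names here are purely hypothetical):

<?php
// hypothetical connection details, only to make the sketch self-contained
mysql_connect('localhost', 'user', 'password');
mysql_select_db('forum');

$post_id = 42;

// one lookup on the root table...
$queries = array("SELECT * FROM posts WHERE id = $post_id");

// ...plus one range lookup per child table, each of which can sit on a
// different part of the disk
foreach (array('post_comments', 'post_tags', 'post_attachments') as $child) {
    $queries[] = "SELECT * FROM $child WHERE post_id = $post_id";
}

foreach ($queries as $sql) {
    $result = mysql_query($sql);
    while ($row = mysql_fetch_array($result)) {
        // reassemble the entity from the pieces
    }
}
?>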

These range lookups are likely comprised of random IO — different tables will definitely reside in different spots on disk, and it's possible that different rows in the same range in the same table for an entity might not be contiguous (depending on how the entity has been updated, etc).

So for this example, the final tally is about 20 times more IO with MySQL per logical access, compared to MongoDB.

This is how MongoDB can boost performance in some use cases.
