如何提高嵌套集模型中数千个插入的速度? [英] How to improve speed of thousand of inserts in a Nested Set model?
问题描述
我正在使用MySQL。我想插入100.000个属于不同级别(root_id,lft,rgt,级别)的嵌套集模型的位置。如果我不必担心这棵树,我将进行批量插入。但是在这种情况下,我必须对此订单上的每个对象执行3个查询:
I'm using MySQL. I want to insert 100.000 places who belong to a Nested Set Model with different levels (root_id, lft, rgt, level). If I didn't have to worry about the tree, I would do a bulk insert; but in this case, I have to do 3 queries for each object on this order:
foreach ( $places as $place )
{
# get parent place from $ancestor_array ( woeid => ids )
$parent = $ancestors[:parent_woeid]
update place set lft = lft + 2 where root_id = :root_id and lft >= :obj_lft;
update place set rgt = rgt + 2 where root_id = :root_id and rgt >= :obj_lft;
insert into place (..., root_id, lft, rgt, level, ...) values (..., :obj_root_id, :obj_lft, :obj_rgt, :obj_level, ...);
...
}
这需要很长时间...所以是时候尝试变得更聪明了。这就是我的想法:
That takes a long time... so it's time to try to be smarter. This is what I'm thinking:
foreach ( $places as $place )
{
# get parent place from $ancestor_array ( woeid => ids )
$parent = $ancestors[:parent_woeid]
$new_admins[] = array('parent_woeid' => :parent_woeid, ...data place....)
$woeids[] = :parent_woeid;
}
# lock + bulk insert of $new_admins with lft=rgt=level=root_id= null + unlock
insert into place (...) values (...), (...), (...), ...., (...)
# get ids of those places
select from place where woeid in(:array_woeids)
# lock + bulk update with 3 updates per place + unlock
update place set lft= lft + 2 where root_id = :root_id and lft >= :obj_lft;
update place set rgt = rgt + 2 where root_id = :root_id and rgt >= :obj_lft;
update place set lft=:obj_lft, rgt=:obj_rgt where id=:id
# i have to update the ancestors lft & rgt values on the $ancestors_array,
# because we have them "cached" in the array in order to avoid doing a
# select per inserted place.
您怎么看?你会怎么做?您是否会将所有插入和更新保存到文件中,并使用 LOAD DATA INFILE
语法代替此选项?
What do you think? How would you do it? Would you save all the inserts and updates into a file and use the LOAD DATA INFILE
syntax instead of this option?
我还有其他选择吗?
非常感谢!
推荐答案
仅在插入所有项目后才插入项目并重新计算所有lft / rgt索引会更有效吗?
Wouldn't it be more efficient to just insert the items and recalculate all the lft/rgt indexes only when all items are inserted?
但是怎么办您必须知道每个项目的父级ID,因此您可以使用一些索引作为基础。我个人成功地将lft + lrg + parentId与每个数据项保持在一起,这使维护变得更加容易(您可以随时进行琐碎的重新计算/完整性检查)。
However to do that you have to know parent ID of each item, so you have something to base the indexes on. I personally had success keeping lft + lrg + parentId with each data item, it makes maintenance much easier (you can do trivial recalculation/integrity check at any time).
这篇关于如何提高嵌套集模型中数千个插入的速度?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!