INSERT VS UPDATE:MySQL,700万行 [英] INSERT vs UPDATE: MySQL, 7 million rows

查看:169
本文介绍了INSERT VS UPDATE:MySQL,700万行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对于那些选择,这不是一个编程问题,它最终是我必须的1)代码,2)选项#1中每行的数据比较的算法的数量。

我遇到了一些问题,泡菜。我有一个遵循Google GTFS规范的数据库,我现在正在编写一个自动更新程序来为这个数据库服务。

I have run into a bit of a pickle. I have a database which follows Google GTFS specs and I am now writing an automatic update program to service this database.

数据库每3个月进行一次大修。具有最少行数的表由1-10行组成,最大的表包含700万行。其余的介于10 000到80 000之间。

The database gets an overhaul once every 3 months. The table with the least amount of rows consists of between 1-10 rows and the largest table contains 7 million rows. The rest have between 10 000 and 80 000.

我的程序要下载的文件是.txt文件,可以在表中翻译。换句话说:stops.txt = database.stops。数据库是InnoDB类型。

The files my program to download are .txt files which translate in a table. In other words: stops.txt = database.stops. Database is InnoDB type.

我想出了两个解决方案。

I have come up with 2 solutions.


  • 1)。几个.txt文档中的每一行id都要与数据库中的内容进行比较,如果没有发生变化do
    什么都没有,如果有变化,do

  • 1) every row id in the several .txt documents is to be compared to what is currently in the database, if nothing has changed do nothing, if something has changed, do an UPDATE.

2)将下载的文件插入到自己的表中(基本上镜像活动表) 。示例:
database.stop_new和database.stop开关名称。

2) INSERT the downloaded files into their own tables (basically mirroring the live tables) and then switch table names. Example: database.stop_new and database.stop switch names.

另一个扭曲:大修将在一个.txt文件中定义的某个日期完成,这意味着它可以在一个星期六在高峰假期也许,也就是说,用户可以随时发送查询。

Another twist: The overhaul will be done at a certain date defined in one of the .txt files, this means it could be done a Saturday on a peak holiday prehaps, meaning that users could be sending queries at all times.

问题:我应该采用哪种方法来确保没有中断,并且用户体验得到维护。 (我倾向于选项2 ...)

Question: Which approach should I go for to endure that nothing breaks and that the user experience is upheld. (I am leaning towards option 2...)

推荐答案

根据MySQL文档...

According to MySQL documentation...


MySQL对MyISAM,MEMORY和MERGE表使用表级锁定,对BDB表使用
页级锁定,对InnoDB使用行级锁
表。

MySQL uses table-level locking for MyISAM, MEMORY and MERGE tables, page-level locking for BDB tables, and row-level locking for InnoDB tables.

您将能够对表进行行级锁定,而不是使整个内容不可用...

You will be able to row-level lock the table, instead of rendering the entire contents unusable...


表更新优先级高于表检索。

Table updates are given higher priority than table retrievals.


$ b b

更新优先于选择,并且也基于键值,因此我认为这将是一个更好的选择。

Updates take priority over selects, and is also based on key values, so i think this would be a much better option.

行级锁定


不同会话访问不同行时,锁冲突较少

Fewer lock conflicts when different sessions access different rows

回滚更改

可能长时间锁定单行

>行级锁定的缺点:


级别或表级锁定

Requires more memory than page-level or table-level locks

当用于表的大部分
时,比页级锁或表级锁慢,因为您必须获取更多锁

Slower than page-level or table-level locks when used on a large part of the table because you must acquire many more locks

比其他锁慢,如果你经常对一个大的
部分数据执行GROUP BY操作,或者如果你必须经常扫描整个表

Slower than other locks if you often do GROUP BY operations on a large part of the data or if you must scan the entire table frequently

但是,一般来说,表锁比MySQL文档...优于行级锁...

However, in general table-locks are superior to row-level locks according to MySQL Documentation...

另一个选项...


您可以使用应用程序级
锁,例如
MySQL中的GET_LOCK()和RELEASE_LOCK()提供的锁。这些是咨询锁,因此它们只与应用程序
相互合作。请参见第12.14节杂项
函数。

Instead of using row-level locks, you can employ application-level locks, such as those provided by GET_LOCK() and RELEASE_LOCK() in MySQL. These are advisory locks, so they work only with applications that cooperate with each other. See Section 12.14, "Miscellaneous Functions".

这篇关于INSERT VS UPDATE:MySQL,700万行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆