Bulk insert of MySQL related tables from bash

Question

I need to regularly upload a fairly large amount of data from CSV files to a MySQL database. I used to do this by simply executing LOAD DATA INFILE from bash scripts. Now, however, the data are to be spread over several tables, and the relations between them must be kept. What are the general strategies in such cases?

Let's assume an initially simple task: a one-to-many relation between two tables.

I am thinking of something like this:


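  1. Get the maximal identifier for table 1

  2. Apply identifiers to the CSV file manually

  3. Split the file with the two target tables in mind

  4. Insert into both tables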
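
Steps 2 and 3 of this plan can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input layout (one `parent_name,item` pair per row), the function names `split_rows` and `to_csv`, and the starting value 41 are all hypothetical, and the sketch assumes none of the parents exist in table 1 yet.

```python
import csv
import io


def split_rows(rows, max_id):
    """Split flat (parent_name, item) rows into rows for two tables.

    rows   -- iterable of (parent_name, item) pairs read from the CSV
    max_id -- largest identifier currently present in the parent table
    Returns (parents, children): parents are (id, name) rows and
    children are (parent_id, item) rows that reference them.
    """
    ids = {}          # parent_name -> identifier assigned in this run
    parents = []
    children = []
    for name, item in rows:
        if name not in ids:
            max_id += 1
            ids[name] = max_id
            parents.append((max_id, name))
        children.append((ids[name], item))
    return parents, children


def to_csv(rows):
    """Render rows as CSV text that could be written to a file for loading."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical input: three child rows that belong to two parents.
source = io.StringIO("alice,apple\nalice,pear\nbob,plum\n")
parents, children = split_rows(csv.reader(source), max_id=41)
print(to_csv(parents), end="")
print(to_csv(children), end="")
```

The identifiers are only safe if nobody else inserts into table 1 between reading the maximal identifier and loading the files, which is why the locking question below matters.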

Is this an optimal solution? (In the real case I am going to have, for example, lots of many-to-many relations to update this way.)

Can I lock table 1 from bash for the duration of the whole process? Or do I have to use some intermediary tool like Perl or Python to keep everything in one session?

Recommended answer

There are various conflicting requirements expressed in your question. This answer concentrates on the "keep lock" aspect of it.

In order to maintain a table lock for the whole operation, you'll have to maintain a single connection to the MySQL server. One way would be to pass everything as a multi-line, multi-command input to a single invocation of the mysql command line client. Basically like this:

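# Everything printed inside the braces goes to one mysql process,
# so all statements share one connection and the lock persists.
# Loading into Table1 is an assumption; name your own target table.
{ echo "LOCK TABLES Table1 WRITE;"
  for i in "${infiles[@]}"; do
    echo "LOAD DATA LOCAL INFILE '${i}' INTO TABLE Table1;"
  done
  echo "UNLOCK TABLES;"
} | mysql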

That works as long as you can generate all the required statements without querying the database (for example, for the maximal identifier) while the lock is held.

In order to mix read operations (like asking for a maximal value) and write operations (like loading the content of some files), you'll need bidirectional communication with the server. Achieving this through bash is very tricky, so I'd advise against it. Even if you don't need to ask questions, the unidirectional connection provided by a bash pipe is a source of danger: if anything goes wrong on the mysql side, bash won't notice and will issue the next command anyway. You might end up committing inconsistent data.
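
Even the one-way variant becomes somewhat safer when the script that feeds the client also checks how the client exited. The following is a minimal sketch, not a definitive implementation: `sql_quote`, `build_batch` and `run_batch` are hypothetical helper names, the comma field separator and the file paths are assumptions, the quoting assumes the server's default backslash escaping, and the error check assumes the mysql client exits with a non-zero status when a statement fails.

```python
import subprocess


def sql_quote(text):
    """Quote a string as a MySQL string literal (default escaping assumed)."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_batch(infiles, table):
    """Return one SQL batch that locks `table`, loads each file and unlocks."""
    lines = [f"LOCK TABLES {table} WRITE;"]
    for path in infiles:
        lines.append(
            f"LOAD DATA LOCAL INFILE {sql_quote(path)} INTO TABLE {table} "
            "FIELDS TERMINATED BY ',';"
        )
    lines.append("UNLOCK TABLES;")
    return "\n".join(lines) + "\n"


def run_batch(batch, client=("mysql", "--local-infile=1")):
    """Feed the batch to one client process; raise if it exits non-zero."""
    subprocess.run(list(client), input=batch, text=True, check=True)


# Only print the batch here; run_batch(...) would need a reachable server.
print(build_batch(["/tmp/part1.csv", "/tmp/part2.csv"], "Table1"), end="")
```

This only reports a failure after the fact. It does not undo files that were already loaded, so it does not replace the transactional approach.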

For these reasons, I'd rather suggest a scripting language for which MySQL bindings are available, like the Perl or Python options you mentioned. Reading CSV files in those languages is easy, so you could do all of the following in a single script:


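  1. LOCK TABLES

  2. Start a transaction

  3. Read the input CSV files

  4. Ask questions such as the maximal id

  5. Adjust the input data to match the table layout

  6. Insert the data into the tables

  7. If no errors occurred, commit the transaction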
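
The steps above can be sketched in Python against any DB-API connection. This is a hedged illustration: `load_related`, the table layouts `Table1(id, name)` and `Table2(table1_id, item)`, and the `parent_name,item` input format are hypothetical, and the `%s` placeholder style is assumed to match the driver in use. One deliberate deviation from the list: the MySQL manual states that START TRANSACTION releases existing table locks, so the sketch uses `SET autocommit = 0` before `LOCK TABLES` instead of an explicit transaction start. Rollback only has an effect on a transactional storage engine such as InnoDB.

```python
import csv


def load_related(conn, csv_lines):
    """Load (parent_name, item) CSV rows into Table1 and Table2.

    conn is assumed to be a DB-API connection to MySQL; everything runs
    on that single connection, so the table locks stay in force.
    """
    rows = list(csv.reader(csv_lines))            # read the input CSV
    cur = conn.cursor()
    # SET autocommit = 0 stands in for START TRANSACTION, which would
    # release the table locks taken on the next line.
    cur.execute("SET autocommit = 0")
    cur.execute("LOCK TABLES Table1 WRITE, Table2 WRITE")
    try:
        cur.execute("SELECT COALESCE(MAX(id), 0) FROM Table1")
        next_id = cur.fetchone()[0] + 1           # ask for the maximal id
        ids, parents, children = {}, [], []
        for name, item in rows:                   # adjust data to the layout
            if name not in ids:
                ids[name] = next_id
                parents.append((next_id, name))
                next_id += 1
            children.append((ids[name], item))
        cur.executemany("INSERT INTO Table1 (id, name) VALUES (%s, %s)", parents)
        cur.executemany(
            "INSERT INTO Table2 (table1_id, item) VALUES (%s, %s)", children
        )
        conn.commit()                             # no error: make it permanent
    except Exception:
        conn.rollback()                           # any error: discard the inserts
        raise
    finally:
        cur.execute("UNLOCK TABLES")              # release locks in every case
    return len(parents), len(children)
```

The connection could come from a driver such as PyMySQL or MySQL Connector/Python; which one is used is left open here. The sketch reads the CSV before taking the locks so they are held for a shorter time, which does not change the outcome.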
