Insert Data into MySQL in multiple Tables in C# efficiently


Question

I need to insert a huge CSV file into two tables with a 1:n relationship in a MySQL database.

The CSV file arrives weekly and is about 1 GB in size; it needs to be appended to the existing data. Each of the two tables has an auto-increment primary key.

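I have tried:

  • Entity Framework (takes the most time of all approaches)
  • DataSets (same)
  • Bulk upload (does not support multiple tables)
  • MySqlCommand with parameters (requires nesting; this is my current approach)
  • MySqlCommand with a stored procedure, including transactions

Any other suggestions?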

Simplified, let's say this is my data structure:

public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<string> Codes { get; set; }
}

I need to insert the data from the CSV into this database:

       User   (1-n)   Code     
+---+-----+-----+ +---+---+-----+        
|PID|FName|LName| |CID|PID|Code | 
+---+-----+-----+ +---+---+-----+
| 1 |Jon  | Foo | | 1 | 1 | ed3 | 
| 2 |Max  | Foo | | 2 | 1 | wst | 
| 3 |Paul | Foo | | 3 | 2 | xsd | 
+---+-----+-----+ +---+---+-----+ 

Here is a sample line from the CSV file:

Jon;Foo;ed3,wst
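
As a minimal sketch, the sample line above could be mapped onto the User class roughly like this. The CsvLineParser name is hypothetical, and the code assumes that fields never contain ';' or ',' themselves and that the file uses no quoting; a real file may need a proper CSV reader.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<string> Codes { get; set; }
}

// Hypothetical helper, not part of the original question.
public static class CsvLineParser
{
    // Parses one line of the form "FirstName;LastName;code1,code2,...".
    public static User Parse(string line)
    {
        string[] fields = line.Split(';');
        if (fields.Length != 3)
            throw new FormatException("Expected 3 ';'-separated fields: " + line);

        return new User
        {
            FirstName = fields[0].Trim(),
            LastName = fields[1].Trim(),
            // Assumption: empty entries (e.g. from a trailing comma) are dropped.
            Codes = fields[2]
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(c => c.Trim())
                .ToList()
        };
    }
}
```

Calling CsvLineParser.Parse("Jon;Foo;ed3,wst") yields a User with FirstName "Jon", LastName "Foo" and the two codes "ed3" and "wst".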

A bulk load such as LOAD DATA LOCAL INFILE is not possible because I have restricted write permissions.

Recommended answer

Given the large size of the data, the best approach performance-wise is to leave as much of the data processing as possible to the database rather than the application.

Create a staging table that temporarily holds the data from the .csv file.

CREATE TABLE `imported` (
    `id` int(11) NOT NULL,
    `firstname` varchar(45) DEFAULT NULL,
    `lastname` varchar(45) DEFAULT NULL,
    `codes` varchar(450) DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Loading the data from the .csv file into this table is straightforward. I would suggest using MySqlCommand (which is also your current approach). Also, reusing the same MySqlConnection object for all INSERT statements will reduce the total execution time.
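
One way to sketch that loading step is to send several rows per INSERT statement over a single open connection. The ImportedLoader class and its method names below are hypothetical. The code is written against the ADO.NET interfaces from System.Data, which MySqlConnection implements, so it should also work with a MySqlConnection. It supplies the id column explicitly because the imported table above declares it NOT NULL, and it assumes the staging table is emptied before each weekly run.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class ImportedLoader
{
    // Builds one multi-row statement:
    // INSERT INTO imported (...) VALUES (@id0,@f0,@l0,@c0),(@id1,@f1,@l1,@c1),...
    public static string BuildInsertSql(int rowCount)
    {
        if (rowCount < 1) throw new ArgumentOutOfRangeException(nameof(rowCount));
        var sql = new StringBuilder(
            "INSERT INTO imported (id, firstname, lastname, codes) VALUES ");
        for (int i = 0; i < rowCount; i++)
        {
            if (i > 0) sql.Append(',');
            sql.Append("(@id").Append(i)
               .Append(",@f").Append(i)
               .Append(",@l").Append(i)
               .Append(",@c").Append(i)
               .Append(')');
        }
        return sql.ToString();
    }

    // Inserts one batch; each row is { firstname, lastname, codes }.
    // The caller owns the connection and the transaction, so both are reused
    // across all batches instead of being reopened per row.
    public static void InsertBatch(IDbConnection conn, IDbTransaction tx,
                                   IList<string[]> rows, int firstId)
    {
        using (IDbCommand cmd = conn.CreateCommand())
        {
            cmd.Transaction = tx;
            cmd.CommandText = BuildInsertSql(rows.Count);
            for (int i = 0; i < rows.Count; i++)
            {
                AddParam(cmd, "@id" + i, firstId + i);
                AddParam(cmd, "@f" + i, rows[i][0]);
                AddParam(cmd, "@l" + i, rows[i][1]);
                AddParam(cmd, "@c" + i, rows[i][2]);
            }
            cmd.ExecuteNonQuery();
        }
    }

    private static void AddParam(IDbCommand cmd, string name, object value)
    {
        IDbDataParameter p = cmd.CreateParameter();
        p.ParameterName = name;
        p.Value = value;
        cmd.Parameters.Add(p);
    }
}
```

The batch size is a tuning knob rather than a fixed number: each statement has to fit within the server's max_allowed_packet, so a few hundred to a few thousand rows per batch is a plausible starting point to measure from.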

Then, to process the data further, you can create a stored procedure that handles it.

Assuming these two tables (taken from your simplified example):

CREATE TABLE `users` (
  `PID` int(11) NOT NULL AUTO_INCREMENT,
  `FName` varchar(45) DEFAULT NULL,
  `LName` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`PID`)
) ENGINE=InnoDB AUTO_INCREMENT=3737 DEFAULT CHARSET=utf8;

CREATE TABLE `codes` (
  `CID` int(11) NOT NULL AUTO_INCREMENT,
  `PID` int(11) DEFAULT NULL,
  `code` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`CID`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8;

you can use the following stored procedure.

CREATE DEFINER=`root`@`localhost` PROCEDURE `import_data`()
BEGIN
    DECLARE fname VARCHAR(255);
    DECLARE lname VARCHAR(255);
    -- Sized to match imported.codes so long code lists are not truncated.
    DECLARE codesstr VARCHAR(450);
    DECLARE splitted_value VARCHAR(255);
    DECLARE done INT DEFAULT 0;
    DECLARE newid INT DEFAULT 0;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;

    -- One cursor row per staged CSV line.
    DECLARE cur CURSOR FOR SELECT firstname,lastname,codes FROM imported;
    -- Fetching past the last row sets done instead of raising an error.
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cur;

    import_loop: 
        LOOP FETCH cur INTO fname, lname, codesstr;
            IF done = 1 THEN
                LEAVE import_loop;
            END IF;

            -- Insert the parent row and remember its generated key.
            INSERT INTO users (FName,LName) VALUES (fname, lname);
            SET newid = LAST_INSERT_ID();

            -- Number of codes = number of commas + 1.
            SET i=1;
            SET occurance = (SELECT LENGTH(codesstr) - LENGTH(REPLACE(codesstr, ',', '')) + 1);

            WHILE i <= occurance DO
                -- Take the prefix up to the i-th comma, cut off the previous
                -- prefix, then strip the leading comma. CHAR_LENGTH is used
                -- because SUBSTRING counts characters, not bytes.
                SET splitted_value =
                    (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(codesstr, ',', i),
                    CHAR_LENGTH(SUBSTRING_INDEX(codesstr, ',', i - 1)) + 1), ',', ''));

                INSERT INTO codes (PID, code) VALUES (newid, splitted_value);
                SET i = i + 1;
            END WHILE;
        END LOOP;
    CLOSE cur;
END

For every row in the source data, it issues an INSERT statement for the users table. A WHILE loop then splits the comma-separated codes and issues one INSERT statement into the codes table for each of them.
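
To make the splitting expression easier to follow, here is a small C# trace of the same steps. It is only an illustration of the logic and is not part of the import itself. The SplitTrace class and its SubstringIndex helper are hypothetical; the helper emulates MySQL's SUBSTRING_INDEX for a comma delimiter and a non-negative count.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration, not part of the original answer.
public static class SplitTrace
{
    // Emulates SUBSTRING_INDEX(s, ',', count) for count >= 0: everything to
    // the left of the count-th comma, or all of s if there are fewer commas.
    private static string SubstringIndex(string s, int count)
    {
        if (count <= 0) return "";
        int pos = -1;
        for (int n = 0; n < count; n++)
        {
            pos = s.IndexOf(',', pos + 1);
            if (pos < 0) return s;
        }
        return s.Substring(0, pos);
    }

    public static List<string> Split(string codes)
    {
        var result = new List<string>();
        // Number of codes = number of commas + 1.
        int occurrences = codes.Length - codes.Replace(",", "").Length + 1;
        for (int i = 1; i <= occurrences; i++)
        {
            // Prefix up to the i-th comma.
            string prefix = SubstringIndex(codes, i);
            // Length of the previous prefix; as a 0-based offset this equals
            // the 1-based position (length + 1) used in the SQL.
            int start = SubstringIndex(codes, i - 1).Length;
            // What remains is ",code" (or "code" for i = 1); drop the comma.
            result.Add(prefix.Substring(start).Replace(",", ""));
        }
        return result;
    }
}
```

For the sample value "ed3,wst" the loop runs twice and produces "ed3" and then "wst". A trailing comma, as in "ed3,", produces an empty second entry, which suggests the stored procedure would insert an empty code for such input unless it is filtered out beforehand.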

Regarding the use of LAST_INSERT_ID(): it is reliable on a per-connection basis (see the excerpt from the MySQL documentation below). As long as the MySQL connection used to run this stored procedure is not used by other transactions at the same time, using LAST_INSERT_ID() is safe.

The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.

Edit: Here is the OP's variant, which omits the staging table imported. Instead of inserting the data from the .csv file into the imported table, you call the stored procedure once per CSV line to store the data directly in your database.

CREATE DEFINER=`root`@`localhost` PROCEDURE `import_data`(IN fname VARCHAR(255), IN lname VARCHAR(255),IN codesstr VARCHAR(255))
BEGIN
    DECLARE splitted_value VARCHAR(255);
    DECLARE newid INT DEFAULT 0;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;

    -- Insert the parent row and remember its generated key.
    INSERT INTO users (FName,LName) VALUES (fname, lname);
    SET newid = LAST_INSERT_ID();

    -- Number of codes = number of commas + 1.
    SET i=1;
    SET occurance = (SELECT LENGTH(codesstr) - LENGTH(REPLACE(codesstr, ',', '')) + 1);

    WHILE i <= occurance DO
        -- Extract the i-th code; CHAR_LENGTH is used because SUBSTRING
        -- counts characters, not bytes.
        SET splitted_value =
            (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(codesstr, ',', i),
            CHAR_LENGTH(SUBSTRING_INDEX(codesstr, ',', i - 1)) + 1), ',', ''));

        INSERT INTO codes (PID, code) VALUES (newid, splitted_value);
        SET i = i + 1;
    END WHILE;
END

Note: The code that splits the codes is taken from another source, since MySQL does not provide a split function for strings.
