Build a batch query for MySQL that inserts every 1000 items

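Question

I need to perform a batch insert in MySQL/MariaDB, but since the data is dynamic I need to build the proper SQL query at runtime. In a few steps:

  • I need to find out whether the current row already exists in the table; this is the first SELECT inside the loop.
  • Right now I have 1454 rows, but I will have to insert around 150k later, so one batch query is better than 150k single-row INSERTs inside the loop.
  • If the record already exists I should update it; if it doesn't, I should insert it. I don't care about the UPDATE yet, and the code you're seeing covers only the INSERT.

So here is what I am doing: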

// Get values from Csv file as an array of values
$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";

$i = 0;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) ";

// Processing on each row of data
foreach ($data as $row) {
    $sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
    echo "DEBUG: ", $sql, "\n";
    $rs = $conn->query($sql);

    if ($rs === false) {
        echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
    } else {
        $rows_returned = $rs->num_rows;

        $veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
        $first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
        $last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
        $email = "'".$conn->real_escape_string($row['Email'])."'";
        $username = "'".$conn->real_escape_string($row['Username'])."'";
        $display_name = "'".$conn->real_escape_string(
                ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
            )."'";

        // VALUES should be added only if row doesn't exists
        if ($rows_returned === 0) {

            // VALUES should be append until they reach 1000
            while ($i % 1000 !== 0) {
                $sqlInsert .= "VALUES($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW())";
                ++$i;;
            }

            // QUERY should be output to console to see if it's right or something is wrong
            echo "DEBUG: ", $sqlInsert, "\n";

            // QUERY should be executed if there are 1000 VALUES ready to add as a batch

            /*$rs = $conn->query($sqlInsert);

            if ($rs === false) {
                echo 'Wrong SQL: '.$sqlInsert.' Error: '.$conn->error, E_USER_ERROR;*/
            }
        } else {
            // UPDATE
            echo "UPDATE";
        }
    }
}

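But this line of code: echo "DEBUG: ", $sql, "\n"; is not outputting anything to the console. I must be doing something wrong, but I can't find what. Can anyone help me build the proper batch query and execute it after every 1000 appended values?

The expected output is: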

DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008ReolAAC'
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='005800000039SIWAA2'
....
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES(...), VALUES(...), VALUES(...)

The output I actually get:

DEBUG count(data): 1454
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RGg6AAG'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
DEBUG: SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='00580000008RQ4CAAW'
DEBUG: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt)
.... // until reach 1454 results

The table is empty, so the code should never go through the ELSE branch (the UPDATE one).

EDIT

With help from the answer, this is how the code looks now:

$data = convertCsvToArray($fileName);
echo "DEBUG count(data): ", count($data), "\n";

$i = 1;
$sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";

foreach ($data as $row) {
    $sql = "SELECT id,lastSyncAt FROM reps WHERE veeva_rep_id='{$row['Id']}'";
    $rs = $conn->query($sql);

    if ($rs === false) {
        echo 'Wrong SQL: '.$sql.' Error: '.$conn->error, E_USER_ERROR;
    } else {
        $rows_returned = $rs->num_rows;

        $veeva_rep_id = "'".$conn->real_escape_string($row['Id'])."'";
        $first = "'".$conn->real_escape_string(ucfirst(strtolower($row['FirstName'])))."'";
        $last = "'".$conn->real_escape_string(ucfirst(strtolower($row['LastName'])))."'";
        $email = "'".$conn->real_escape_string($row['Email'])."'";
        $username = "'".$conn->real_escape_string($row['Username'])."'";
        $display_name = "'".$conn->real_escape_string(
                ucfirst(strtolower($row['FirstName'])).' '.ucfirst(strtolower($row['LastName']))
            )."'";

        if ($rows_returned === 0) {
            if ($i % 1000 === 0) {
                file_put_contents("output.log", $sqlInsert."\n", FILE_APPEND);
                $sqlInsert = "INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES";
            } else {
                $sqlInsert .= "($veeva_rep_id,$first,$last,$email,$username,NOW(),NOW(),$display_name,'VEEVA','https://pdone.s3.amazonaws.com/avatar/default_avatar.png',NOW(),NOW()), ";
            }

            $i++;
        } else {
            echo "UPDATE";
        }
    }
}

But it is still buggy because:

  • I get a first, empty INSERT query: INSERT INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES
  • I get a second INSERT query with 1000 VALUES() groups appended, but what happened to the rest, the remaining 454?

Can anyone give me another tip?

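Answer

Consider using INSERT IGNORE INTO table instead of checking whether the record already exists (see "How to 'insert if not exists' in MySQL?"). If you haven't already done so, make veeva_rep_id a PRIMARY (or UNIQUE) key so that INSERT IGNORE will work.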
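
INSERT IGNORE silently skips rows whose key already exists, whereas the question eventually wants existing rows updated. One option MySQL and MariaDB offer for that is INSERT ... ON DUPLICATE KEY UPDATE. The following is only a sketch of how such a statement could be assembled. buildUpsertSql() is a hypothetical helper, not part of PDO or mysqli; it assumes veeva_rep_id carries a PRIMARY or UNIQUE key and that the table and column names come from trusted code, never from user input.

```php
<?php

/**
 * Hypothetical helper (not a library function): builds a multi-row
 * INSERT ... ON DUPLICATE KEY UPDATE statement with positional (?) placeholders.
 * Table and column names are assumed to be trusted, hard-coded identifiers.
 */
function buildUpsertSql(string $table, array $columns, int $rowCount, array $updateColumns): string
{
    if ($rowCount < 1 || $columns === [] || $updateColumns === []) {
        throw new InvalidArgumentException('Need at least one row, one column and one column to update.');
    }

    // One "(?, ?, ...)" group per row.
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $values = implode(', ', array_fill(0, $rowCount, $group));

    // VALUES(col) refers to the value that would have been inserted for col.
    $updates = implode(', ', array_map(
        static fn (string $col): string => "$col = VALUES($col)",
        $updateColumns
    ));

    return sprintf(
        'INSERT INTO %s (%s) VALUES %s ON DUPLICATE KEY UPDATE %s',
        $table,
        implode(', ', $columns),
        $values,
        $updates
    );
}

// Two rows, three columns; first and last are refreshed when the key exists.
$sql = buildUpsertSql('reps', ['veeva_rep_id', 'first', 'last'], 2, ['first', 'last']);
echo $sql, PHP_EOL;
```

The resulting string can be passed to prepare() together with a flat list of values, one per placeholder. As far as I know, recent MySQL 8.0 releases deprecate VALUES() in this position in favor of a row alias, so check the documentation for the server version you run.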

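Also check out PDO for transactions and prepared statements, and for generating queries dynamically (see "PDO Prepared Inserts multiple rows in single query").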

<?php

$sql = 'INSERT IGNORE INTO reps(veeva_rep_id,first,last,email,username,lastLoginAt,lastSyncAt,display_name,rep_type,avatar_url,createdAt,updatedAt) VALUES ';

$insertQuery = array();
$insertData = array();

/*

assuming the array from the csv is like this

$data = array(
    0 => array('name' => 'Robert', 'value' => 'some value'),
    1 => array('name' => 'Louise', 'value' => 'another value')
);
*/

// Counter that gives each row's placeholders a unique suffix
$n = 0;

foreach ($data as $row) {
    $insertQuery[] = '(:veeva_rep_id' . $n . ', :first' . $n . ', :last' . $n . ', :email' . $n . ', :username' . $n . ', :lastLoginAt' . $n . ', :lastSyncAt' . $n . ', :display_name' . $n . ', :rep_type' . $n . ', :avatar_url' . $n . ', :createdAt' . $n . ', :updatedAt' . $n . ')';
    $insertData['veeva_rep_id' . $n] = $row['name'];
    $insertData['first' . $n] = $row['value'];
    $insertData['last' . $n] = $row['name'];
    $insertData['email' . $n] = $row['value'];
    $insertData['username' . $n] = $row['name'];
    $insertData['lastLoginAt' . $n] = $row['value'];
    $insertData['lastSyncAt' . $n] = $row['value'];
    $insertData['display_name' . $n] = $row['name'];
    $insertData['rep_type' . $n] = $row['value'];
    $insertData['avatar_url' . $n] = $row['value'];
    $insertData['createdAt' . $n] = $row['name'];
    $insertData['updatedAt' . $n] = $row['value'];

    $n++;
}

$db->beginTransaction();

// Execute only when there is at least one row to insert
if (!empty($insertQuery)) {
    $sql .= implode(', ', $insertQuery);

    $stmt = $db->prepare($sql);
    $stmt->execute($insertData);
}

$db->commit();

print $sql . PHP_EOL;
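
The remaining 454 rows in the question get lost because a batch is only written when the counter hits a multiple of 1000, and nothing flushes what is left once the loop ends. A sketch of one way around that is to split the rows with array_chunk() and build one statement per chunk, so the last, shorter chunk is handled like any other. buildBatchInsert(), the three-column list and the stand-in data are illustrative assumptions; the commented-out PDO calls assume a connected $db.

```php
<?php

/**
 * Hypothetical helper: builds one multi-row INSERT IGNORE statement and the
 * flat parameter list for a chunk of rows. Each row is an associative array
 * keyed by column name; table and column names are assumed to be trusted.
 */
function buildBatchInsert(string $table, array $columns, array $rows): array
{
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $params = [];

    foreach ($rows as $row) {
        foreach ($columns as $column) {
            // Missing keys become NULL instead of raising a warning.
            $params[] = $row[$column] ?? null;
        }
    }

    $sql = sprintf(
        'INSERT IGNORE INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $group))
    );

    return [$sql, $params];
}

// Stand-in data: 1454 rows, the count mentioned in the question.
$data = [];
for ($i = 1; $i <= 1454; $i++) {
    $data[] = ['veeva_rep_id' => "id$i", 'first' => 'First', 'last' => 'Last'];
}

$columns = ['veeva_rep_id', 'first', 'last'];
$batchSizes = [];

foreach (array_chunk($data, 1000) as $chunk) {
    [$sql, $params] = buildBatchInsert('reps', $columns, $chunk);
    $batchSizes[] = count($chunk);

    // With a connected PDO instance in $db, each batch could be run as:
    // $stmt = $db->prepare($sql);
    // $stmt->execute($params);
}

echo implode(', ', $batchSizes), PHP_EOL; // 1000, 454
```

Chunking also keeps each statement small. With the twelve columns from the question, 1000 rows means 12,000 placeholders per statement, which stays below the 65,535 placeholders MySQL allows in a server-side prepared statement.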

Let me know if it helps.
