使用csv上的文件指针 [英] Working with file pointers on a csv

查看:145
本文介绍了使用csv上的文件指针的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想知道如何更改下面的代码读取x行数只处理sql插入语句,然后继续通过x号读取文件,并处理直到文件结束。我是新的文件指针的想法,但我理解应该可能使用fgets。

I was wondering how to change the code below to read x number of lines only process the sql insert statement then continue to read the file by x number and process until end of file. I am new to the idea of file pointers but i understand it should be possible using fgets.

我希望将下面的代码变成一个函数,我可以通过文件名和要读取和处理的行数。

I'm hoping to change the below code into a function where I can pass the filename and the number of lines I want read and processed.

我目前有:
(来自此处

$handle = fopen(dirname(__FILE__)."/files/workorderstest.csv" , "r");

$batch++;

if ($handle) {
    $counter = 0;

    //instead of executing query one by one,
    //let us prepare 1 SQL query that will insert all values from the batch

    $sql ="INSERT INTO workorderstest(id,parentid,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10) VALUES ";

    while (($line = fgets($handle)) !== false) {
       $sql .= "($line),";
       $counter++;
    }

    $sql = substr($sql, 0, strlen($sql) - 1);

    var_dump($sql);

    if ($conn->query($sql) === TRUE) {

    } else {

    }

    fclose($handle);
}

我想保持内存占用最小。我认为这应该只是一个跟踪指针的问题 - >重复直到行到达 - >进程sql - >开始指针 - >重复直到eof。

I want to keep the memory footprint to a minimum. I'm thinking this should just be a matter of keeping track of the pointer -> repeat until lines reached -> process sql -> start at pointer -> repeat until eof.


  1. fgets()最适合使用吗?

  2. 我需要回调或一些这样的延迟sql处理,直到所有行读取?

  3. 我有点失去了开始,因为我还在学习PHP。

  1. Is fgets() the best to use for this?
  2. Do I need to incorperate a callback or some such to defer the sql processing until all lines are read?
  3. I'm a bit lost on where to start as I am still learning PHP.

****如果它能帮助别人,则更新下面的回答脚本...

**** Updated Answered script below if it helps someone else ...

date_default_timezone_set('Australia/Brisbane');
$date = date('m/d/Y h:i:s a', time());
$timezone = date_default_timezone_get();
$time_start = microtime(true);

$batch_size = 500; // Lines to be read per batch
$batch = 0;
$counter = 0;
$lines = 0;

$conn = new mysqli($servername, $username, $password, $dbname);

if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}

// Remove Existing Data from table
$sql = "TRUNCATE TABLE  `workorderstest`";
$conn->query($sql);

$handle = fopen(dirname(__FILE__)."/files/workorders.csv" , "r");

//instead of executing query one by one,
//let us prepare 1 SQL query that will insert all values from the batch

$sql_prefix ="INSERT INTO workorderstest(id,parentid,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10) VALUES ";
$values = "";

while (($line = fgets($handle)) !== false) {
    $values .= "($line),";
    $counter++;
    $lines++;
    if ($counter == $batch_size) {
        $values = substr($values, 0, strlen($values) - 1);
        $conn->query($sql_prefix . $values) or die($conn->error);
        $counter = 0;
        $values ="";
        $batch++;
    }
}
if ($counter > 0) { // Execute the last batch
    $values = substr($values, 0, strlen($values) - 1);
    $conn->query($sql_prefix . $values) or die($conn->error);
}

// Output results
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Importing Script running at: $date <br/>";
echo "Timezone: $timezone <br/>";
echo "<br/>";
echo "Script Summary:";
echo "Time running script: " . round($time,3) . " seconds <br/>";
echo "Memory: ".memory_get_usage() . " bytes <br/>";
echo "Records Updated: $lines <br/>";
echo "Batches run: $batch <br/>";

?>


推荐答案


  1. em> fgets()是否最适合使用?这是一个很好的方法。另一个选项是用 file()将整个文件读入数组,然后使用 foreach()循环遍历数组

  1. Is fgets() the best to use for this? It's a fine way to do it. Another option is to read the entire file into an array with file(), then loop over the array with foreach().

我需要强制回调吗?不需要。从文件中读取每一行之后执行查询

Do I need to incorperate a callback? No. Just perform the query after reading every batch of lines from the file.

何时开始?当计数器达到批量大小时,请执行查询。然后将计数器设置为 0 ,并将查询字符串设置为初始值。最后,在循环结束时,您需要使用剩余值执行查询(除非文件大小是批量大小的确切倍数,在这种情况下不会有任何剩余)。

Where to start? When the counter reaches the batch size, perform the query. Then set the counter back to 0 and set the query string back to the initial value. Finally, at the end of the loop you'll need to perform the query with the remaining values (unless the file size was an exact multiple of the batch size, in which case there won't be anything remaining).



$batch_size = 100;
$counter = 0;

//instead of executing query one by one,
//let us prepare 1 SQL query that will insert all values from the batch

$sql_prefix ="INSERT INTO workorderstest(id,parentid,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10) VALUES ";
$values = "";

while (($line = fgets($handle)) !== false) {
    $values .= "($line),";
    $counter++;
    if ($counter == $batch_size) {
        $values = substr($values, 0, strlen($values) - 1);
        $conn->query($sql_prefix . $values) or die($conn->error);
        $counter = 0;
        $values ="";
    }
}
if ($counter > 0) { // Execute the last batch
    $values = substr($values, 0, strlen($values) - 1);
    $conn->query($sql_prefix . $values) or die($conn->error);
}

这篇关于使用csv上的文件指针的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆