How to split a large CSV into multiple small CSVs using Linux?

Question

I need to split a large CSV file into multiple files by lines, using PHP and Linux.

The CSV contains:

"id","name","address"
"1","abc","this is test address1 which having multiple  newline
separators."
"2","abc","this is test address2
which having multiple newline  separators"
"3","abc","this is test address3.
which having multiple
newline separators."

I used the Linux command: split -l 5000 testfile

But it cannot split the CSV correctly: the address field contains multiple newline characters, and split counts physical lines, so it can cut a file in the middle of a record.
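To illustrate the mismatch, here is a minimal Ruby sketch that counts physical lines (what split -l sees) against parsed CSV records for the sample data above. The names SAMPLE_CSV and line_and_record_counts are made up for this illustration.

```ruby
require "csv"

# Sample data from the question: a header and 3 records
# spread over 8 physical lines.
SAMPLE_CSV = <<~EOS
  "id","name","address"
  "1","abc","this is test address1 which having multiple  newline
  separators."
  "2","abc","this is test address2
  which having multiple newline  separators"
  "3","abc","this is test address3.
  which having multiple
  newline separators."
EOS

# Returns [physical line count, CSV record count] for a CSV string.
# (Hypothetical helper, used only for this demonstration.)
def line_and_record_counts(text)
  [text.lines.count, CSV.parse(text).size]
end

lines, records = line_and_record_counts(SAMPLE_CSV)
puts "physical lines: #{lines}"                 # prints "physical lines: 8"
puts "CSV records (incl. header): #{records}"   # prints "CSV records (incl. header): 4"
```

Because the two counts differ, any tool that splits on raw line numbers can break a quoted field across two output files. A CSV-aware parser is needed.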

I've also tried to use PHP:

$inputFile = 'filename.csv';
$outputFile = "outputfile";
$splitSize = 5000;
$in = fopen($inputFile, 'r');
$header = fgetcsv($in);
$rowCount = 0;
$fileCount = 1;

while (!feof($in)) { 
    if (($rowCount % $splitSize) == 0) {
        if ($rowCount > 0) {
            fclose($out);
        }   
        $filename = $outputFile . $fileCount++;
        $out = fopen($filename .'.csv', 'w');
        chmod($filename . '.csv', 0777);
        fputcsv($out, $header);
    }   
    $data = fgetcsv($in);
    if ($data) {
        fputcsv($out, $data);
        $rowCount++;
    }   
}
fclose($out);

How can I solve this?

Recommended answer

Using Ruby:

ruby -e 'require "csv"
        f = ARGV.shift
        CSV.foreach(f).with_index{ |e, i|
            File.write("#{f}.#{i}", CSV.generate_line(e, force_quotes: true))
        }' file.csv
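The one-liner above writes every row, header included, to its own file. If chunks of several records are wanted instead (for example 5000, as in the question), a sketch along the following lines might work. The function name split_csv and the output naming scheme "<input>.partN.csv" are assumptions made for this example, not part of the original answer.

```ruby
require "csv"

# Split input_path into files holding at most rows_per_file data records each,
# repeating the header row at the top of every output file.
# Returns the list of paths written.
# NOTE: split_csv and the ".partN.csv" naming are hypothetical choices.
def split_csv(input_path, rows_per_file)
  written = []
  out = nil
  header = nil
  CSV.foreach(input_path).with_index do |row, i|
    if i.zero?
      header = row            # first record is the header
      next
    end
    if ((i - 1) % rows_per_file).zero?
      out&.close              # finish the previous chunk, if any
      path = "#{input_path}.part#{written.size + 1}.csv"
      out = CSV.open(path, "w", force_quotes: true)
      out << header
      written << path
    end
    out << row                # embedded newlines stay inside quoted fields
  end
  written
ensure
  out&.close                  # close the last chunk, even on error
end
```

Calling split_csv("file.csv", 5000) would then produce file.csv.part1.csv, file.csv.part2.csv, and so on. Records are counted by the CSV parser rather than by physical line, so a multi-line address field is never split across files.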

PHP:

<?php
    $inputFile = 'file.csv';
    $outputFile = 'file.out';
    $splitSize = 1;
    if (($in = fopen($inputFile, 'r'))) {
        $header = fgetcsv($in);
        $rowCount = 0;
        $fileCount = 0;
        while (($data = fgetcsv($in))) {
            if (($rowCount % $splitSize) == 0) {
                if ($rowCount > 0) {
                    fclose($out);
                }
                $filename = $outputFile . ++$fileCount . '.csv';
                $out = fopen($filename, 'w');
                chmod($filename, 0755); // octal literal; decimal 755 would set the wrong mode
                fputcsv($out, $header);
            }
            fputcsv($out, $data);
            $rowCount++;
        }
        fclose($out);
    }
?>
