Remove carriage returns from CSV data value
Question
I am importing data from a pipe-delimited CSV into MySQL using a LOAD DATA INFILE statement, with lines terminated by '\r\n'. My problem is that some of the data within a row also contains '\r\n', which causes the load to fail. I have similar files that use only '\n' within the data to indicate line breaks, and those load without any issues.
Good CSV example
School|City|State|Country\r
Harvard University|Cambridge|MA|USA\r
Princeton University|Princeton|New
Jersey
|USA\r
Bad CSV example
School|City|State|Country\r
Harvard University|Cambridge|MA|USA\r
Princeton University|Princeton|New\r
Jersey\r
|USA\r
Is there a way to pre-process the CSV, using sed, awk, or perl, to clean up the extra carriage return in the column values?
Recommended answer
This is one possible solution in Perl. It reads a line and, if that line has fewer than 4 fields, keeps reading the next line and merging it in until the record has 4 fields. The merged pieces are joined with a bare '\n', so the output matches the good example above. Just change the value of $number_of_fields to the right number for your file.
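#!/usr/bin/perl
use strict;
use warnings;

# Number of pipe-delimited fields expected in each record.
my $number_of_fields = 4;

while (<STDIN>) {
    # Strip every CR and LF from the physical line.
    s/[\r\n]//g;
    # A limit of -1 keeps trailing empty fields, so "a|b|c|" counts as 4.
    my @fields = split(/\|/, $_, -1);
    next if $#fields == -1;    # skip blank lines

    # Too few fields: the record continues on the following line(s).
    while ($#fields < $number_of_fields - 1) {
        # Stop at end of input; test defined() so a final "0" is not lost.
        defined(my $nextline = <STDIN>) or last;
        $nextline =~ s/[\r\n]//g;
        my @tmpfields = split(/\|/, $nextline, -1);
        next if $#tmpfields == -1;
        # The first piece continues the last field; join with a bare LF.
        $fields[$#fields] .= "\n" . $tmpfields[0];
        shift @tmpfields;
        push @fields, @tmpfields;
    }
    # Terminate the repaired record with CRLF for LOAD DATA INFILE.
    print join("|", @fields), "\r\n";
}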
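Since the question also mentions awk, here is a rough awk sketch of the same merge-until-enough-fields idea. It is not part of the original answer. The file names bad.csv and fixed.csv and the variable n are placeholders chosen for illustration, and the sample input is the bad example from the question. It assumes every record has exactly n fields and that no field value contains a literal pipe. Unlike the Perl script, it strips only a trailing CR from each line.

```shell
# Recreate the bad example from the question (CRLF after every physical line).
printf 'School|City|State|Country\r\nHarvard University|Cambridge|MA|USA\r\nPrinceton University|Princeton|New\r\nJersey\r\n|USA\r\n' > bad.csv

# n is the expected number of fields per record (an assumed parameter name).
awk -v n=4 '
{
    sub(/\r$/, "")                          # drop the trailing CR
    if ($0 == "") next                      # skip blank lines
    rec = (rec == "" ? $0 : rec "\n" $0)    # join continuation lines with LF
    tmp = rec
    if (gsub(/[|]/, "|", tmp) >= n - 1) {   # n fields means n-1 delimiters
        printf "%s\r\n", rec                # complete record: end with CRLF
        rec = ""
    }
}
END { if (rec != "") printf "%s\r\n", rec } # flush an incomplete last record
' bad.csv > fixed.csv
```

With the sample input, fixed.csv should have CRLF only at the end of each record, with the line breaks inside the State value reduced to plain LF, as in the good example.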