sed: replace whitespace characters with a single comma, except inside quotes


Question

This line is from a car dataset (https://archive.ics.uci.edu/ml/datasets/Auto+MPG) and looks like this:

15.0   8.   429.0      198.0      4341.      10.0   70.  1.     "ford galaxie 500"

How would one replace each run of whitespace (the file has both spaces and tabs) with a single comma, but not inside the quotes, preferably using sed, to turn the dataset into a real CSV? Thanks!

Recommended answer

This can be done with awk:

awk -F'"' 'BEGIN { OFS="\"" } { for(i = 1; i <= NF; i += 2) { gsub(/[ \t]+/, ",", $i); } print }' filename.csv

With " as the field separator, every second field (the odd-numbered ones: 1, 3, 5, ...) is a part of the line outside the quotes, which is where whitespace should be replaced. Then:

BEGIN { OFS = FS }               # output should also be separated by "
{
  for(i = 1; i <= NF; i += 2) {  # in every second field
    gsub(/[ \t]+/, ",", $i)      # replace spaces with commas
  }
  print                          # and print the whole shebang
}
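
To check the behaviour on the line from the question, the same awk program can be wrapped in a small shell function and fed the sample through a pipe. This is only a sketch: the function name ws_to_csv is made up for illustration, and it assumes every quote in the input is balanced and that no field contains an escaped quote.

```shell
# Hypothetical helper (the name is an assumption); wraps the awk program above.
ws_to_csv() {
  awk -F'"' 'BEGIN { OFS = FS }        # rejoin fields with " on output
  {
    for (i = 1; i <= NF; i += 2) {     # odd-numbered fields lie outside the quotes
      gsub(/[ \t]+/, ",", $i)          # each run of spaces/tabs becomes one comma
    }
    print
  }' "$@"                              # reads the named files, or stdin if none
}

# Sample line from the question, with a tab mixed in among the spaces.
printf '15.0   8.\t429.0      198.0      4341.      10.0   70.  1.     "ford galaxie 500"\n' | ws_to_csv
# prints: 15.0,8.,429.0,198.0,4341.,10.0,70.,1.,"ford galaxie 500"
```

Note that whitespace at the very start or end of a line is also outside the quotes, so it would turn into a leading or trailing comma; lines like that would need trimming first.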
