Remove double quotes and commas from a numeric value in a .CSV file

Question

I have a .CSV file in which a few records contain numbers that are enclosed in double quotes and have commas between the quotes (for example, "455,365.44"). I need to remove the quotes and commas from these numeric values ("455,365.44" should become 455365.44 after processing) so that I can use them in further processing of the file.

Here is a sample of the file:

column 1, column 2, column 3, column 4, column 5, column 6, column 7
12,"455,365.44","string with quotes, and with a comma in between","4,432",6787,890,88
432,"222,267.87","another, string with quotes, and with two comma in between","1,890",88,12,455
11,"4,324,653.22","simple string",77,777,333,22

The result I need is:

column 1, column 2, column 3, column 4, column 5, column 6, column 7
12,455365.44,"string with quotes, and with a comma in between",4432,6787,890,88
432,222267.87,"another, string with quotes, and with two comma in between",1890,88,12,455
11,4324653.22,"simple string",77,777,333,22

P.S.: Only the numeric values should be converted like this; the string values should remain unchanged.

Please help...

Recommended answer

To remove the quotes (replace each quoted number with the same number without the quotes):

s/"(\d[\d.,]*)"/\1/g

For the commas, I could only think of a lookahead and a lookbehind, if that's supported by your regex implementation (replace a comma with nothing if what comes before and after it is part of a number within quotes). Note that the lookbehind here is variable-length, which many engines, Perl's among them, do not accept:

s/(?<="[\d,]+),(?=[\d,.]+")//g

You would have to execute this before removing the quotes.

It might also work without the lookbehind:

s/,(?=[\d,.]*\d")//g
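To see what the lookahead-only pattern does on its own, here is a minimal sketch in Python rather than Perl (Python's `re` module accepts the same lookahead syntax). The function name `strip_number_commas` is made up for this illustration. Only commas inside quoted numbers are removed; the quotes themselves are still in place after this step.

```python
import re

# Lookahead-only pattern from the answer: a comma is removed only when the
# text after it consists of digits, commas and dots, ends in a digit, and is
# followed directly by a double quote.
COMMA_IN_QUOTED_NUMBER = re.compile(r',(?=[\d,.]*\d")')


def strip_number_commas(line: str) -> str:
    """Remove commas inside quoted numbers; leave everything else alone."""
    return COMMA_IN_QUOTED_NUMBER.sub("", line)


sample = '12,"455,365.44","string with quotes, and with a comma in between","4,432",6787,890,88'
print(strip_number_commas(sample))
```

The field separators and the commas inside the quoted string stay untouched, because none of them is followed by a run of digits ending right before a quote.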

In a shell script you might want to use perl, e.g. execute:

perl -p -e 's/,(?=[\d,.]*\d")//g; s/"(\d[\d,.]*)"/\1/g' test.csv

Explanation of the regexes:

First execute:

s/,(?=[\d,.]*\d")//g 

This removes every comma that is followed by the rest of a number ([\d,.]*\d) and a closing quote, so only commas inside quoted numbers are removed.

Next execute:

s/"(\d[\d,.]*)"/\1/g

This replaces every quoted number with its value without the quotes.
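The two steps can also be tried together outside the shell. The following is a rough Python sketch, not the answer's own code: the helper names `clean_line` and `clean_text` are invented here, and the patterns are copied from the two substitutions above. One caveat of the regex approach worth keeping in mind: a quoted string that ends in a comma followed directly by digits (such as "abc,123") would lose that comma as well.

```python
import re

# Step 1: commas that sit inside a quoted number.
COMMA_IN_QUOTED_NUMBER = re.compile(r',(?=[\d,.]*\d")')
# Step 2: a quoted field that starts with a digit and holds only digits, commas and dots.
QUOTED_NUMBER = re.compile(r'"(\d[\d,.]*)"')


def clean_line(line: str) -> str:
    """Apply both substitutions in order: commas first, then quotes."""
    line = COMMA_IN_QUOTED_NUMBER.sub("", line)
    return QUOTED_NUMBER.sub(r"\1", line)


def clean_text(text: str) -> str:
    """Clean every line of a CSV document given as one string."""
    return "\n".join(clean_line(line) for line in text.splitlines())


rows = (
    '12,"455,365.44","string with quotes, and with a comma in between","4,432",6787,890,88\n'
    '11,"4,324,653.22","simple string",77,777,333,22'
)
print(clean_text(rows))
```

Both substitutions always run here, so a quoted number that contains no comma (for example "455.44") still has its quotes removed.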
