Handling a CSV with line feed characters in a column in PowerShell
Problem description
Currently, I have a system that creates a delimited file like the one below, in which I've mocked up the extra line feeds that sporadically appear within the columns.
Column1,Column2,Column3,Column4
Text1,Text2[LF],text3[LF],text4[CR][LF]
Text1,Text2[LF][LF],text3,text4[CR][LF]
Text1,Text2,text3[LF][LF],text4[CR][LF]
Text1,Text2,text3[LF],text4[LF][LF][CR][LF]
I've been able to remove the problematic line feeds in Notepad++ with the following regex, which ignores the valid carriage return/line feed combinations:
(?<![\r])[\n]
However, I am unable to find a solution using PowerShell. I think that when I use Get-Content on the CSV file, the line feeds within the text fields are ignored and each value is stored as a separate object in the variable assigned to the Get-Content result. My question is: how can I apply the regex to the CSV file using replace if the cmdlet ignores the line feeds when loading the data?
I've also tried the method below to load the content of my CSV. It doesn't work either, as it just results in one long string, similar to using -join (Get-Content).
[STRING]$test = [io.file]::ReadAllLines('C:\CONV\DataOutput.csv')
$test.replace("(?<![\r])[\n]","")
$test | out-file .\DataOutput_2.csv
Nearly there, may I suggest just 3 changes:
- use ReadAllText(…) instead of ReadAllLines(…)
- use -replace … instead of .Replace(…), only then will the first argument be treated as a regex
- do something with the replacement result (e.g. assign it back to $test)
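To illustrate the second and third points, here is a minimal sketch that works on an in-memory string rather than a file. The variable names ($sample, $literal, $regex) and the sample row are made up for illustration; only the regex comes from the question.

```powershell
# Hypothetical sample: one stray LF inside a field, CRLF ending the record.
$sample = "Text1,Text2`n,text3,text4`r`n"

# String.Replace() searches for the literal characters of the pattern,
# which do not occur in the sample, so nothing is replaced.
$literal = $sample.Replace('(?<![\r])[\n]', '')

# The -replace operator treats its first argument as a regular expression:
# every LF that is not preceded by CR is removed.
$regex = $sample -replace '(?<![\r])[\n]', ''

# Neither call modifies $sample; the result has to be captured in a variable.
$literal -eq $sample
$regex -eq "Text1,Text2,text3,text4`r`n"
```

Both comparisons should evaluate to True: the .Replace(…) call leaves the string untouched, while -replace strips only the bare line feed and keeps the CRLF record terminator.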
Sample code:
[STRING]$test = [io.file]::ReadAllText('C:\CONV\DataOutput.csv')
$test = $test -replace '(?<![\r])[\n]',''
$test | out-file .\DataOutput_2.csv
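To check the result end to end, the sketch below builds a small sample file, applies the same regex, and reads the cleaned file back with Import-Csv. The file names in the temp directory and the sample rows are assumptions made for this demo, not part of the original setup. It writes the output with [IO.File]::WriteAllText instead of Out-File so that the string is written as-is; that is a design choice of this sketch, not a requirement.

```powershell
# Hypothetical scratch files in the temp directory, used only for this demo.
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'DataOutput_demo.csv'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'DataOutput_demo_2.csv'

# Sample data shaped like the question's: bare LFs inside fields,
# CRLF terminating each record.
$raw = "Column1,Column2,Column3,Column4`r`n" +
       "Text1,Text2`n,text3`n,text4`r`n" +
       "Text1,Text2`n`n,text3,text4`r`n"
[IO.File]::WriteAllText($inPath, $raw)

# Read the whole file as a single string so the line feeds are preserved.
# On PowerShell 3.0 or later, Get-Content -Raw $inPath does the same.
$text = [IO.File]::ReadAllText($inPath)

# Remove every LF that is not preceded by CR.
$clean = $text -replace '(?<![\r])[\n]', ''
[IO.File]::WriteAllText($outPath, $clean)

# Import-Csv should now return one object per record.
$rows = @(Import-Csv -Path $outPath)
$rows.Count
$rows | Format-Table
```

If the cleanup worked, $rows.Count matches the number of data records in the sample and each column holds its value without embedded line feeds.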