Handling a CSV with line feed characters in a column in PowerShell


Problem description


Currently, I have a system that creates a delimited file like the one below, in which I've mocked up the extra line feeds that occur sporadically within the columns.

Column1,Column2,Column3,Column4

Text1,Text2[LF],text3[LF],text4[CR][LF]

Text1,Text2[LF][LF],text3,text4[CR][LF]

Text1,Text2,text3[LF][LF],text4[CR][LF]

Text1,Text2,text3[LF],text4[LF][LF][CR][LF]

In Notepad++ I've been able to remove the offending line feeds with the following regex, which skips the valid carriage return/line feed combinations:

(?<![\r])[\n]

However, I am unable to find a solution using PowerShell. I think that when I run Get-Content on the CSV file, the line feeds within the text fields are ignored and each piece is stored as a separate object in the variable assigned to the Get-Content result. My question is: how can I apply the regex to the CSV file using replace if the cmdlet ignores the line feeds when loading the data?

I've also tried the method below to load the content of my CSV. It doesn't work either, as it just results in one long string, similar to using -join (Get-Content).

[STRING]$test = [io.file]::ReadAllLines('C:\CONV\DataOutput.csv')
$test.replace("(?<![\r])[\n]","")
$test | out-file .\DataOutput_2.csv

Solution

Nearly there. May I suggest just three changes:

  • use ReadAllText(…) instead of ReadAllLines(…)
  • use -replace … instead of .Replace(…), only then will the first argument be treated as a regex
  • do something with the replacement result (e.g. assign it back to $test)
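The last two points can be checked on an in-memory string before touching any file. The snippet below is a minimal sketch: the variable names ($sample, $literal, $regex) and the sample row are made up for illustration and are not part of the original question.

```powershell
# In-memory sample: one stray LF inside a field, then a valid CRLF row terminator.
$sample = "Text1,Text2`n,text3,text4`r`n"

# String.Replace() looks for the literal characters of the pattern,
# finds none in the sample, and returns the input unchanged.
$literal = $sample.Replace('(?<![\r])[\n]', '')

# -replace treats its first argument as a regular expression and returns
# a new string, so the result has to be captured somewhere.
$regex = $sample -replace '(?<![\r])[\n]', ''

$literal -eq $sample                       # True: nothing was replaced
$regex -eq "Text1,Text2,text3,text4`r`n"   # True: only the stray LF is gone
```

Neither call modifies $sample itself, which is why the result has to be assigned back.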

Sample code:

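# Read the whole file as a single string so the embedded line feeds are preserved
[STRING]$test = [io.file]::ReadAllText('C:\CONV\DataOutput.csv')
# Remove every LF that is not preceded by a CR, and assign the result back
$test = $test -replace '(?<![\r])[\n]',''
$test | out-file .\DataOutput_2.csv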
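To see the whole round trip without touching real data, the same approach can be run against a small throwaway file. This is only a sketch under stated assumptions: the temp-file names, the sample rows, and the use of WriteAllText and Import-Csv for output and verification are illustrative choices, not part of the original answer.

```powershell
# Hypothetical temp paths so the sketch does not touch real data.
$inPath  = Join-Path ([IO.Path]::GetTempPath()) 'DataOutput_sample.csv'
$outPath = Join-Path ([IO.Path]::GetTempPath()) 'DataOutput_sample_clean.csv'

# Build a small file with stray LFs inside fields and CRLF as the real row terminator.
$raw = "Column1,Column2,Column3,Column4`r`n" +
       "Text1,Text2`n,text3`n,text4`r`n" +
       "Text1,Text2,text3`n`n,text4`r`n"
[IO.File]::WriteAllText($inPath, $raw)

# Read the whole file as one string, then drop every LF not preceded by a CR.
$text  = [IO.File]::ReadAllText($inPath)
$clean = $text -replace '(?<![\r])[\n]', ''

# WriteAllText writes the string exactly as given.
[IO.File]::WriteAllText($outPath, $clean)

# The cleaned file now parses as ordinary CSV rows.
$rows = @(Import-Csv -Path $outPath)
$rows.Count          # 2
$rows[0].Column2     # Text2
```

Writing with WriteAllText here is a design choice for the sketch: Out-File appends a line terminator after the string and, in Windows PowerShell, defaults to UTF-16 output, which may or may not be what the consuming system expects.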
