Ignore lines with quotes or get DataWeave to read invalid CSV


Problem description

I'm trying to use Mule's DataWeave component to read a CSV file that isn't valid, or at least doesn't conform to RFC 4180. The issue is that there are some values that contain quotes, but the field isn't escaped. For example,

col1,col2,col3
one,two "two" two,three
one",two,three

Is there a straightforward way to slightly relax the rules in the CSV parser that DataWeave uses, so that it will treat a value that does not start with a double-quote as a non-escaped value? Alternatively, can I (either using DataWeave or some other transformation) ignore all lines of text that have a quote in them? They make up only a fraction of one percent of the rows, and those rows happen to be irrelevant to this integration anyway, but I can't control the CSV generation.

edit: Here's an example:

CSV

Column A,Column B,Column C,Column D
A,Something Weird",C,D
A,B,Something Else" Weird,D,
A,",S,o,m,e,t,h,i,n,g, ,N,o,r,m,a,l,",C,D

DataWeave

%dw 1.0
%input payload application/csv
%output application/json
---
payload

Output

[
  {
    "Column A": "A",
    "Column B": ",C,D\r\nA,B,Something Else",
    "Column C": "D",
    "Column D": ""
  },
  {
    "Column A": "A",
    "Column B": ",S,o,m,e,t,h,i,n,g, ,N,o,r,m,a,l,",
    "Column C": "C",
    "Column D": "D"
  }
]

Solution

Alternatively, can I (either using DataWeave or some other transformation) ignore all lines of text that have a quote in them?

Sure. Just remove all lines containing a double-quote from the input before it reaches your DataWeave transformer.
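As a rough illustration of that pre-filtering step, here is a minimal plain-Java sketch. The class and method names (`QuoteLineFilter`, `dropLinesWithQuotes`) are made up for this example, and it assumes the whole CSV payload is available as a `String`. How you call it from the flow ahead of the DataWeave transformer (for instance from a custom Java component) is left open. Note that it would also drop rows with legitimately quoted fields, which is acceptable here only because those rows are not needed.

```java
final class QuoteLineFilter {

    /**
     * Returns the CSV text with every line that contains a double quote removed.
     * Kept lines are re-joined with CRLF (an assumption; adjust if the
     * downstream reader expects a different line ending).
     */
    static String dropLinesWithQuotes(String csv) {
        if (csv.isEmpty()) {
            return "";
        }
        StringBuilder kept = new StringBuilder();
        // Split on LF or CRLF; the header line passes through untouched
        // as long as it contains no quotes.
        for (String line : csv.split("\\r?\\n")) {
            if (!line.contains("\"")) {
                kept.append(line).append("\r\n");
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String csv = "Column A,Column B,Column C,Column D\r\n"
                + "A,Something Weird\",C,D\r\n"
                + "A,B,C,D\r\n";
        // Only the header and the clean row remain.
        System.out.print(dropLinesWithQuotes(csv));
    }
}
```

Because the filter works on raw lines before any CSV parsing happens, a stray quote can no longer cause the parser to swallow the following line into one field, as it did in the output shown in the question.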
