How to convert a CSV file to a space-delimited file? (Scala, Spark)
Problem description
I have a CSV file. This is my input:
,"",3,"a_b","cde
f\gh","i j","k,""l"
Now I want to convert the CSV file to a space-delimited file. How can I do that?
Here is the specification:
- Comma-delimited data consists of two kinds of field: string 0 (not enclosed in double quotes) and string 1 (enclosed in double quotes).
- An empty string 0 is converted to 0, and an empty string 1 is converted to "_". (The -z option changes the 0 used for string 0; the -n option changes the _ used for string 1.)
- An escaped double quote ("") inside string 1 is converted to a single ". Double quotes cannot be used in string 0.
- A half-width space inside any string is converted to "_" (the -s option changes the _).
- The -e option puts "\" before each "_" (or the character specified by the -s option) and before each "\".
- The -q option removes the preceding "\" from "\"" and "\\".
- \r\n at the end of a line is automatically converted to \n.
- Any \n inside string 1 is converted to the two-character sequence "\n".
- The final line does not require a linefeed (\n).
I want to get the output shown below. Please help me.
0 _ 3 a\_b cde\nf\\gh i_j k,"l
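To make the rules concrete, here is a minimal sketch in plain Scala (no Spark, no CSV library) that walks the input character by character. The object name `CsvToSpace`, the `Opts` case class, and its field names are hypothetical stand-ins for the -z, -n, -s and -e flags; the -q option is not modelled. It assumes that only an underscore already present in the data is escaped under -e, not one produced from a space, because the expected output above shows `a\_b` next to `i_j`. As a simplification, every \r\n is normalized to \n, not only those at the end of a line.

```scala
import scala.collection.mutable.ArrayBuffer

object CsvToSpace {
  // Hypothetical options standing in for the flags named in the specification.
  final case class Opts(
    emptyUnquoted: String = "0", // -z: replacement for an empty string 0
    emptyQuoted: String = "_",   // -n: replacement for an empty string 1
    space: Char = '_',           // -s: replacement for a half-width space
    escape: Boolean = false      // -e: escape the space character and backslashes
  )

  // Apply the per-character rules to the raw content of one field.
  private def convertField(raw: String, quoted: Boolean, o: Opts): String =
    if (raw.isEmpty) { if (quoted) o.emptyQuoted else o.emptyUnquoted }
    else {
      val sb = new StringBuilder
      raw.foreach {
        case ' '                           => sb += o.space   // space -> "_"
        case '\n'                          => sb ++= "\\n"    // newline -> the two characters \n
        case '\\' if o.escape              => sb ++= "\\\\"   // \ -> \\ under -e
        case c if o.escape && c == o.space => sb += '\\'; sb += c
        case c                             => sb += c
      }
      sb.toString
    }

  // Convert CSV text to space-delimited text, one output line per record.
  def convert(text: String, o: Opts = Opts()): String = {
    val in = text.replace("\r\n", "\n")
    val records = ArrayBuffer.empty[String]
    val fields = ArrayBuffer.empty[String]
    val cur = new StringBuilder
    var quoted = false   // the current field started with a double quote
    var inQuotes = false // the scanner is currently between the quotes

    def endField(): Unit = {
      fields += convertField(cur.toString, quoted, o)
      cur.clear()
      quoted = false
    }
    def endRecord(): Unit = {
      endField()
      records += fields.mkString(" ")
      fields.clear()
    }

    var i = 0
    while (i < in.length) {
      val c = in.charAt(i)
      if (inQuotes) {
        if (c == '"') {
          // A doubled quote is an escaped quote; a single one closes the field.
          if (i + 1 < in.length && in.charAt(i + 1) == '"') { cur += '"'; i += 1 }
          else inQuotes = false
        } else cur += c
      } else c match {
        case '"' if cur.isEmpty && !quoted => quoted = true; inQuotes = true
        case ','                           => endField()
        case '\n'                          => endRecord()
        case _                             => cur += c
      }
      i += 1
    }
    // The final line does not need a trailing linefeed.
    if (cur.nonEmpty || quoted || fields.nonEmpty) endRecord()
    records.mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    val input = ",\"\",3,\"a_b\",\"cde\nf\\gh\",\"i j\",\"k,\"\"l\""
    println(convert(input, Opts(escape = true)))
  }
}
```

With `escape = true` the sample input yields the line shown above; with the default options the underscore in `a_b` and the backslash in `f\gh` are left as they are.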
Recommended answer
You could use itto-csv (https://github.com/gekomad/itto-csv) to tokenize the CSV:
implicit val csvFormat: com.github.gekomad.ittocsv.parser.IttoCSVFormat = com.github.gekomad.ittocsv.parser.IttoCSVFormat.default
import com.github.gekomad.ittocsv.util.StringUtils._
val csvString = "1,foo"
val stringList = tokenizeCsvLine(csvString) // Some(List("1", "foo"))
and then apply your specification to stringList:
stringList.getOrElse(???).map(field => yourSpec(field))
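As an illustration of that last step, the sketch below fills in a possible `yourSpec` and keeps the `Option` instead of calling `getOrElse(???)`, which throws `NotImplementedError` when the tokenizer returns `None`. It does not depend on itto-csv: an `Option[List[String]]` stands in for the tokenizer result, and `ApplySpec` and `toSpaceDelimited` are hypothetical names. A `List[String]` carries no record of whether a field was quoted, so this version maps every empty field to "_" and covers only the space and newline rules.

```scala
object ApplySpec {
  // Hypothetical per-field rule: spaces become "_", newlines become the two characters \n.
  def yourSpec(field: String): String =
    if (field.isEmpty) "_"
    else field.flatMap {
      case ' '  => "_"
      case '\n' => "\\n"
      case c    => c.toString
    }

  // Stand-in for the tokenizer result: Some(fields) on success, None on failure.
  def toSpaceDelimited(tokens: Option[List[String]]): Option[String] =
    tokens.map(_.map(yourSpec).mkString(" "))

  def main(args: Array[String]): Unit = {
    println(toSpaceDelimited(Some(List("1", "foo bar", "")))) // Some(1 foo_bar _)
    println(toSpaceDelimited(None))                           // None
  }
}
```

Returning an `Option[String]` leaves it to the caller to decide what to do with a line that could not be tokenized.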