How to write standard CSV


Problem description

It is very simple to read a standard CSV file, for example:

 val t = spark.read.format("csv")
 .option("inferSchema", "true")
 .option("header", "true")
 .load("file:///home/xyz/user/t.csv")

It reads a real CSV file, something like:

   fieldName1,fieldName2,fieldName3
   aaa,bbb,ccc
   zzz,yyy,xxx

t.show produced the expected result.

I need the inverse: to write a standard CSV file (not a directory of non-standard files).

It is very frustrating not to get the inverse result when write is used. Maybe some other option, or some kind of format("REAL csv please!"), exists.

I am using Spark v2.2 and running tests on spark-shell.

The "syntactical inverse" of read is write, so it is expected to produce the same file format. But the result of

   t.write.format("csv").option("header", "true").save("file:///home/xyz/user/t-writed.csv")

is not a CSV file in the RFC 4180 standard format like the original t.csv, but a t-writed.csv/ folder containing the file part-00000-66b020ca-2a16-41d9-ae0a-a6a8144c7dbc-c000.csv.deflate plus _SUCCESS, which looks like "parquet", "ORC" or some other format.
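
A side note on the .deflate suffix: that part file is still CSV text, just compressed with the deflate codec, not Parquet or ORC. The csv writer's compression option can switch the compression off; a small sketch:

    // Write uncompressed part files so they end in plain .csv.
    t.write.format("csv")
      .option("header", "true")
      .option("compression", "none")   // no .deflate suffix on the part file
      .save("file:///home/xyz/user/t-writed.csv")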

Any language with a complete toolkit for things that "read something" is able to "write the something"; it is a kind of orthogonality principle.

Similar questions, and links that did not solve the problem, perhaps because they used an incompatible Spark version, or perhaps because of a spark-shell limitation. They have good clues for experts:

  • This similar question pointed out by @JochemKuijpers: I tried the suggestion but got the same ugly result.

  • This link says that there is a solution (!), but I can't copy/paste saveDfToCsv() into my spark-shell ("error: not found: type DataFrame"); any clue?
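
For what it's worth, the "not found: type DataFrame" error in spark-shell usually just means the type alias has not been imported; it lives in org.apache.spark.sql. A hypothetical minimal helper (my sketch, not the linked answer's code) that compiles in the shell once the import is in place:

    // The DataFrame type alias lives in org.apache.spark.sql and must be
    // imported before pasting any helper that mentions it in a signature.
    import org.apache.spark.sql.DataFrame

    // Hypothetical minimal helper: write the frame as CSV with a header
    // into the given directory.
    def saveDfToCsv(df: DataFrame, path: String): Unit =
      df.write.option("header", "true").csv(path)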

Recommended answer

If you're using Spark because you're working with "big"* datasets, you probably don't want to do anything like coalesce(1) or toPandas(), since that will most likely crash your driver (the whole dataset has to fit into the driver's RAM, which it usually does not).
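
For completeness, here is a minimal sketch of the coalesce(1) route this answer warns about, for data known to be small; the output paths and the rename step are my own placeholders, not anything Spark does for you:

    // Small data only: force a single partition so Spark writes one part file,
    // then rename that part file to a plain CSV path via the Hadoop FileSystem API.
    import org.apache.hadoop.fs.Path

    val outDir = "file:///home/xyz/user/t-single"   // placeholder output directory
    t.coalesce(1)
      .write.format("csv")
      .option("header", "true")
      .save(outDir)

    // Spark still writes a directory; pick out the lone part file and rename it.
    val fs = new Path(outDir).getFileSystem(spark.sparkContext.hadoopConfiguration)
    val part = fs.globStatus(new Path(outDir + "/part-*"))(0).getPath
    fs.rename(part, new Path("file:///home/xyz/user/t-single.csv"))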

On the other hand: if your data does fit into the RAM of a single machine, why are you torturing yourself with distributed computing?

*Definitions vary. My personal one is "does not fit in an Excel sheet".
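
Following that footnote, a sketch of the single-machine route (my illustration, not part of the original answer): collect the rows to the driver and write one RFC 4180-style file with plain Java IO. The quoting below handles commas, quotes and newlines, and nothing more exotic:

    // Collect to the driver (small data only) and write a single CSV file.
    import java.io.PrintWriter

    // Quote a field per RFC 4180: wrap in double quotes if it contains a
    // comma, quote or newline, doubling any embedded quotes.
    def quote(s: String): String =
      if (s.exists(c => c == ',' || c == '"' || c == '\n'))
        "\"" + s.replace("\"", "\"\"") + "\""
      else s

    val pw = new PrintWriter("/home/xyz/user/t-local.csv")  // placeholder path
    pw.println(t.columns.map(quote).mkString(","))
    t.collect().foreach { row =>
      pw.println(row.toSeq.map {
        case null => ""
        case v    => quote(v.toString)
      }.mkString(","))
    }
    pw.close()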
