Convert Parquet to CSV
Problem description
How can I convert a Parquet file on the local file system to CSV (e.g. with Python or some library) WITHOUT Spark? I am looking for the simplest, most minimal solution possible, because everything needs to be automated and resources are limited.
I tried parquet-tools on my Mac, for example, but the data output did not look correct.
The output must be such that when data is missing in some columns, the CSV has a corresponding NULL (an empty field between two commas).
Thanks.
Recommended answer
You can do this with the Python packages pandas and pyarrow (pyarrow is an optional dependency of pandas that you need for this feature).
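import pandas as pd
# Reading Parquet requires pyarrow (or another Parquet engine) to be installed
df = pd.read_parquet('filename.parquet')
# index=False keeps the DataFrame index out of the CSV; missing values become empty fields by default
df.to_csv('filename.csv', index=False)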
When you need to modify the contents of the file, you can apply standard pandas operations to df before writing it out.