fixed width data into postgres
Problem description
Looking for a good way to load fixed-width data into Postgres tables. I do this in SAS and Python, not Postgres, and I guess there is no native method. The files are a few GB. The one approach I have seen, loading the file as one large column and then parsing it into tables, does not work on my files for some reason (possibly memory issues). I could use psycopg2 but would rather not because of memory issues. Any ideas or tools that work? Does pgloader work well, or is there a native method?
Thanks
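For reference, the "load as one large column, then parse" approach mentioned above can be sketched as SQL generated from Python. All table names, column names, and field offsets below are hypothetical placeholders; executing the statements requires a live connection, so that part is only shown in comments.

```python
# Sketch of the single-wide-column staging approach: COPY the raw lines
# into a one-column staging table, then slice them into typed columns
# with substr(). Field layout here is hypothetical.

FIELDS = [              # (column_name, start, length), 1-based start
    ("id",     1,  6),
    ("name",   7, 20),
    ("amount", 27, 10),
]

def parse_sql(staging_table: str, target_table: str) -> str:
    """Build an INSERT ... SELECT that slices the staged raw lines
    with substr(); trim() drops the fixed-width padding."""
    cols = ", ".join(name for name, _, _ in FIELDS)
    exprs = ", ".join(
        f"trim(substr(raw_line, {start}, {length}))"
        for _, start, length in FIELDS
    )
    return (f"INSERT INTO {target_table} ({cols}) "
            f"SELECT {exprs} FROM {staging_table};")

# Typical use (requires a live server, so only sketched here):
#   COPY staging (raw_line) FROM '/path/to/file.txt';
#   -- then execute:
print(parse_sql("staging", "target"))
```

One caveat with this route: COPY's text format treats tabs and backslashes as special, so raw lines containing them need escaping or a different load path, which may be why it fails on some files.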
There's no convenient built-in method to ingest fixed-width tabular data in PostgreSQL. I suggest using a tool like Pentaho Kettle or Talend Studio to do the data loading, as they're good at consuming many different file formats. I don't remember whether pg_bulkload supports fixed-width input, but I suspect it does not.
Alternately, you can generally write a simple script in something like Python with the psycopg2 module, loading the fixed-width data row by row and sending it to PostgreSQL. psycopg2's support for the COPY command via copy_from makes this vastly more efficient. In a quick search I didn't find a convenient fixed-width file reader for Python, but I'm sure they're out there. In any case you can use whatever language you like; Perl's DBI and DBD::Pg do just as well, and there are millions of fixed-width file reader modules for Perl.
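A minimal sketch of that copy_from approach, assuming a hypothetical field layout, table name, and DSN: the file is streamed in batches, so memory use stays bounded regardless of file size.

```python
import io

# Hypothetical fixed-width layout: (column_name, start, width),
# 0-based character offsets. Adjust to the real record format.
LAYOUT = [("id", 0, 6), ("name", 6, 20), ("amount", 26, 10)]

def to_tsv(line: str) -> str:
    """Slice one fixed-width record into a tab-separated row for COPY."""
    return "\t".join(line[s:s + w].strip() for _, s, w in LAYOUT)

def load(path: str, dsn: str, table: str, batch: int = 50_000) -> None:
    """Stream the file into PostgreSQL via copy_from, one batch at a
    time, instead of materializing the whole file in memory."""
    import psycopg2  # imported here so to_tsv() is usable without it
    cols = [name for name, _, _ in LAYOUT]
    conn = psycopg2.connect(dsn)
    with conn, conn.cursor() as cur, open(path, encoding="ascii") as f:
        buf, n = io.StringIO(), 0
        for line in f:
            buf.write(to_tsv(line.rstrip("\n")) + "\n")
            n += 1
            if n % batch == 0:        # flush a full batch to the server
                buf.seek(0)
                cur.copy_from(buf, table, columns=cols)
                buf = io.StringIO()
        buf.seek(0)                   # flush the final partial batch
        cur.copy_from(buf, table, columns=cols)
    conn.close()
```

The batching is what sidesteps the memory problems mentioned in the question: only one StringIO buffer of at most `batch` rows exists at a time, and copy_from streams it to the server over a single COPY.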