How to make Postgres COPY ignore the first line of a large txt file


Question

I have a fairly large .txt file (~9 GB) that I would like to load into Postgres. The first row is the header, followed by all the data. If I COPY the data directly, the header causes an error saying the data types do not match my Postgres table, so I need to remove it somehow.

Sample data:

ProjectId,MailId,MailCodeId,prospectid,listid,datemailed,amount,donated,zip,zip4,VectorMajor,VectorMinor,packageid,phase,databaseid,amount2

15,53568419,89734,219906,15,2011-05-11 00:00:00,0,0,90720,2915,NonProfit,POLICY,230,3,1,0

16,84141863,87936,164657,243,2011-03-10 00:00:00,0,0,48362,2523,NonProfit,POLICY,1507,5,1,0

16,81442028,86632,15181625,243,2011-01-19 00:00:00,0,0,11501,2115,NonProfit,POLICY,1508,2,1,0

The Postgres COPY command has a "header" option that ignores the first row, but it only works for csv files:

copy training from 'C:/testCSV.csv' DELIMITER ',' csv header;

When I try to run the code above on my txt file, I get an error:

copy training from 'C:/testTXTFile.txt' DELIMITER ',' csv header
ERROR:  unquoted newline found in data
HINT:  Use quoted CSV field to represent newline.

I have tried adding the "quote" and "escape" options, but the command still does not work for the txt file:

copy training from 'C:/testTXTFile.txt' DELIMITER ',' csv header quote as E'"'  escape as E'\\N';
ERROR:  COPY escape must be a single one-byte character

Alternatively, I thought about using Java or creating a separate staging table to remove the first row, but these solutions are expensive and time consuming: I would need to load 9 GB of data just to remove the header row. Are there other solutions that easily remove the first row of a txt file so that I can load the data into my Postgres database?

Recommended answer

Use the HEADER option together with the CSV option:

\copy <table_name>  from '/source_file.csv' delimiter ',' CSV HEADER ;




HEADER Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV format.
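
A hedged aside on the error in the question: the csv keyword selects the parsing format, so the .txt extension by itself is not what COPY rejects. One possible cause of "unquoted newline found in data" is a file with inconsistent line endings, although nothing in the question confirms this. The Python sketch below assumes that cause. It streams the file, reads \r\n, \r and \n uniformly as \n, and drops the header line without holding the 9 GB in memory. The helper names normalised_text and strip_header are hypothetical, and the UTF-8 encoding is an assumption.

```python
import io


def normalised_text(binary_stream, encoding="utf-8"):
    """Wrap a binary stream so that \\r\\n, \\r and \\n are all read as \\n.

    newline=None turns on universal-newline translation for reads.
    The UTF-8 default is an assumption about the input file.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding, newline=None)


def strip_header(src, dst):
    """Copy every line of text stream `src` to `dst` except the first.

    Lines are processed one at a time, so memory use stays small
    regardless of file size. Returns the number of data lines written.
    """
    next(src, None)  # discard the header line; harmless if the file is empty
    written = 0
    for line in src:
        dst.write(line)
        written += 1
    return written
```

To use it, open the source file in binary mode, pass it through normalised_text, and write the result either to a new file or to a stream that feeds COPY training FROM STDIN WITH (FORMAT csv). Once the line endings are consistent, the CSV HEADER option from the answer may also work on the original file directly, which would make the header stripping unnecessary.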
