PostgreSQL CSV import that skips rows
Question
I have a PostgreSQL script that automatically imports CSV files into my database. The script can detect and remove duplicate records and perform a proper upsert, but it still cannot handle everything. Basically, the CSV files are exported from other systems that append extra information at the beginning and end of the file, e.g.:
Total Count: 2956
Avg Time: 13ms
Column1, Column2, Column3
... ... ...
What I want to do is skip those initial rows, as well as any rows at the bottom of the file. Is there any way I can do this in PostgreSQL, via COPY or any other route? Can I, for instance, call operating system commands from PostgreSQL?
Answer
For Linux, use tail and head to crop the file and pipe the result into your script:
tail -n +3 file.csv | head -n -1 | psql -f my_script.sql my_database
Then your script will copy from STDIN:
copy my_table from STDIN;
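On the question's last point (calling operating system commands from PostgreSQL): since PostgreSQL 9.3 the server itself can run the cropping command via COPY ... FROM PROGRAM, which requires superuser rights (or, on newer versions, membership in the pg_execute_server_program role) and a file path readable by the server. A minimal sketch, where the path and table name are assumptions:

```sql
-- Sketch of COPY ... FROM PROGRAM: the server runs the shell command
-- and reads its stdout.  /path/to/file.csv and my_table are assumptions.
copy my_table from program 'tail -n +3 /path/to/file.csv | head -n -1'
    with (format csv, header true);
```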
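To see what the cropping does, here is a small self-contained demonstration on a file shaped like the export above (the trailing summary line is an assumption about the format; note that head -n -1, which drops the last line, is a GNU coreutils extension):

```shell
# Build a sample file like the question's export: two metadata lines,
# a CSV header, data rows, and an assumed trailing summary line.
printf '%s\n' 'Total Count: 2956' 'Avg Time: 13ms' \
  'Column1,Column2,Column3' 'a,b,c' 'd,e,f' 'Footer: done' > sample.csv

# tail -n +3 starts output at line 3 (the CSV header);
# head -n -1 drops the last line (GNU coreutils only).
tail -n +3 sample.csv | head -n -1
# prints:
# Column1,Column2,Column3
# a,b,c
# d,e,f
```

Adjust the `-n +3` and `-n -1` values to match however many junk lines your particular export adds at the top and bottom.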