psql import .csv - Double Quoted fields and Single Double Quote Values

Question

Hello Stack Overflowers,

Weird question. I am having trouble importing a .csv file using psql command line arguments...

The .csv is comma delimited and there are double quotes around cells/fields that have commas in them. I run into an issue where one of the cells/fields has a single double-quote that is being used for inches. So in the example below, it thinks the bottom two rows are all one cell/field.

I can't seem to find a way to make this import correctly. I am hoping to not have to make changes to the file itself and just adjust my psql command.

Ex:
number, number, description  (Headers)
123,124,"description, description"
123,124,description, TV 55"
123,124,description, TV 50"
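
The merging most likely happens because a parser that treats " as the quote character reads the lone " after 55 as the start of a quoted section and keeps reading, across the line break, until it finds the next ". As a rough diagnostic (not part of the original question), the Python sketch below flags lines with an odd number of double quotes. The function name unbalanced_lines is made up for illustration, and the heuristic assumes no field legitimately spans several lines.

```python
def unbalanced_lines(text, quote='"'):
    """Return (line_number, line) pairs whose quote count is odd."""
    return [
        (n, line)
        for n, line in enumerate(text.splitlines(), start=1)
        if line.count(quote) % 2 == 1
    ]


# The sample rows from the question, header included.
sample = '''number, number, description
123,124,"description, description"
123,124,description, TV 55"
123,124,description, TV 50"
'''

for n, line in unbalanced_lines(sample):
    print(n, line)
```

For the sample this reports lines 3 and 4, the two rows ending in an inch mark.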

Command Ex:
\copy table FROM 'C:\Users\Desktop\folder\file.csv' CSV HEADER
\copy table FROM 'C:\Users\Desktop\folder\file.csv' WITH CSV HEADER QUOTE '"' ESCAPE '\' 

I've noticed that saving the file with Excel fixes the issue. Excel formats the records like this:

number, number, description  (Headers)
123,124,"description, description"
123,124,"description, TV 55"""
123,124,"description, TV 50"""
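
This is the usual CSV quoting convention: a field that contains a comma or a double quote is wrapped in double quotes, and every double quote inside it is doubled. A minimal Python sketch, assuming the standard csv module with its default settings (nothing here comes from the original post), produces the same encoding and reads it back:

```python
import csv
import io

# Values are kept as strings, so nothing is reinterpreted as a number.
rows = [
    ["123", "124", "description, description"],
    ["123", "124", 'description, TV 55"'],
]

# Write the rows with the default dialect (minimal quoting, doubled quotes).
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
encoded = buf.getvalue()
print(encoded)

# Reading the text back yields the original values.
decoded = list(csv.reader(io.StringIO(encoded)))
```

Because every value stays a string, this route would not touch leading zeros or switch to scientific notation, but it does mean producing a rewritten copy of the file.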

I don't want to save using Excel, though, because some of my numbers get turned into scientific notation and leading zeros are dropped as soon as the file is opened in Excel.

Recommended answer

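It's an ugly hack, but you can import into a single-column table with \copy table from '/path/to/file' CSV quote e'\x01' delimiter e'\x02' and then try to fix it in SQL with regex functions. Setting the quote and delimiter to control characters that presumably never occur in the file means each line is loaded as one text value, with the commas and double quotes left untouched. This is only workable with reasonably small CSVs, since you're duplicating the data in the single-column table while doing the import.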

testdb=# create table import_data(t text);
CREATE TABLE
testdb=# \! cat /tmp/oof.csv
num0,num1,descrip
123,124,"description, description"
123,124,description, TV 55"
123,124,"description, TV 50""
testdb=# \copy import_data from /tmp/oof.csv csv header quote e'\x01' delimiter e'\x02'
COPY 3
testdb=# CREATE TABLE fixed AS
SELECT
  (regexp_split_to_array(t, ','))[1] num1,
  (regexp_split_to_array(t, ','))[2] num2,
  regexp_replace(
        regexp_replace(regexp_replace(t, '([^,]+,[^,]+),(.*)', '\2'),
                       '"(.*?)"', '\1'),
        '(.*)(")?', '\1\2') as descrip
FROM import_data;
SELECT 3
testdb=# select * from fixed;
 num1 | num2 |         descrip          
------+------+--------------------------
 123  | 124  | description, description
 123  | 124  | description, TV 55"
 123  | 124  | description, TV 50"
(3 rows)
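
To see what the nested regexp_replace calls do to each raw line, here is a rough Python equivalent. It is only an illustration: Python's re module is not the PostgreSQL regex engine, fix_line is a made-up name, and the sketch assumes, like the SQL, that the first two fields never contain commas. The outermost regexp_replace in the SQL appears to leave the string unchanged, so it is omitted here.

```python
import re


def fix_line(t):
    """Split one raw line into (num1, num2, descrip), mirroring the SQL above."""
    fields = t.split(",")
    num1, num2 = fields[0], fields[1]
    # Drop the first two comma-separated fields and keep the rest of the line.
    rest = re.sub(r'([^,]+,[^,]+),(.*)', r'\2', t, count=1)
    # Remove the first pair of double quotes, if any; a lone quote stays as-is.
    descrip = re.sub(r'"(.*?)"', r'\1', rest, count=1)
    return num1, num2, descrip


raw_lines = [
    '123,124,"description, description"',
    '123,124,description, TV 55"',
    '123,124,"description, TV 50""',
]

for line in raw_lines:
    print(fix_line(line))
```

The second substitution is non-greedy, so in the third row it strips only the pair of quotes wrapping the field and leaves the trailing inch mark in place.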
