Insert SQL statements via command line without reopening connection to remote database

Question

I have a large number of data files to process and store in a remote database. Each line of a data file represents a row in the database, but must be formatted before it is inserted.

My first solution was to process the data files with bash scripts that produce SQL data files, and then import the dumped SQL files into the database. This solution seems too slow and, as you can see, involves the extra step of creating an intermediary SQL file.

My second solution was to write bash scripts that, while processing each line of the data file, create an INSERT INTO ... statement and send it to the remote database:

echo sql_statement | psql -h remote_server -U username -d database

That is, it does not create an SQL file. This solution, however, has one major issue that I am seeking advice on: every time, I have to reconnect to the remote database to insert one single row.

Is there a way to connect to the remote database, stay connected, and then "pipe" or "send" the INSERT statements without creating a huge SQL file?

Recommended answer

Answer to your actual question

Yes. You can use a named pipe instead of creating a file. Consider the following demo.

Create a schema x in my database event for testing:

-- DROP SCHEMA x CASCADE;
CREATE SCHEMA x;
CREATE TABLE x.x (id int, a text);

Create a named pipe (fifo) from the shell like this:

postgres@db:~$ mkfifo --mode=0666 /tmp/myPipe

Either 1) call the SQL command COPY using a named pipe on the server:

postgres@db:~$ psql event -p5433 -c "COPY x.x FROM '/tmp/myPipe'"

This acquires a lock on the table x.x in the database (for COPY FROM, a ROW EXCLUSIVE lock, the same mode INSERT takes). The connection stays open until the fifo gets data. Be careful not to leave this open for too long! You can call this after you have filled the pipe to minimize blocking time. You can choose the sequence of events. The command executes as soon as two processes bind to the pipe. The first waits for the second.

Or 2) you can execute SQL from the pipe on the client:

postgres@db:~$ psql event -p5433 -f /tmp/myPipe

This is better suited for your case. Also, no table locks are taken until the SQL is executed in one piece.

Bash will appear blocked. It is waiting for input to the pipe. To do it all from one bash instance, you can send the waiting process to the background instead. Like this:

postgres@db:~$ psql event -p5433 -f /tmp/myPipe 2>&1 &

Either way, from the same bash or a different instance, you can fill the pipe now.
Demo with three rows for variant 1):

postgres@db:~$ printf '1\tfoo\n' >> /tmp/myPipe; printf '2\tbar\n' >> /tmp/myPipe; printf '3\tbaz\n' >> /tmp/myPipe;

(Take care to use tabs as delimiters, or instruct COPY to accept a different delimiter using WITH DELIMITER 'delimiter_character'.)
That will trigger the pending psql with the COPY command to execute and return:

COPY 3

Demo for variant 2):

postgres@db:~$ (echo -n "INSERT INTO x.x VALUES (1,'foo')" >> /tmp/myPipe; echo -n ",(2,'bar')" >> /tmp/myPipe; echo ",(3,'baz')" >> /tmp/myPipe;)

INSERT 0 3
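
The demos above reopen the pipe for every echo. A minimal sketch of an alternative is to hold one writer open on a file descriptor, so the reader sees a single continuous stream until that descriptor is closed. To keep the sketch runnable without a database, cat stands in for the psql reader; the temporary directory, the capture file and the generated statements are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: keep ONE writer open on a named pipe so the reader sees a single stream.
# The reader here is a stand-in (cat); for real use it would be something like:
#   psql -h remote_server -U username -d database -f "$pipe" &
set -euo pipefail

workdir=$(mktemp -d)
pipe="$workdir/myPipe"
received="$workdir/received.sql"   # hypothetical capture file for the stand-in reader

mkfifo "$pipe"

# Start the reader in the background; it blocks until a writer opens the pipe.
cat "$pipe" > "$received" &
reader_pid=$!

# Open the pipe once for writing on file descriptor 3 and keep it open.
exec 3> "$pipe"

# Write each statement as it is produced; the reader stays attached throughout.
for i in 1 2 3; do
    printf "INSERT INTO x.x VALUES (%d, 'row %d');\n" "$i" "$i" >&3
done

# Closing the descriptor signals end-of-file, which lets the reader finish.
exec 3>&-
wait "$reader_pid"

statement_count=$(grep -c '^INSERT' "$received")
cat "$received"
rm -r "$workdir"
```

With psql as the reader, all statements would go through the one session that psql opened, which is the point of the exercise.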

Remove the named pipe when done:

postgres@db:~$ rm /tmp/myPipe

Check success:

event=# select * from x.x;
 id |         a
----+-------------------
  1 | foo
  2 | bar
  3 | baz

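Useful links for the code above

Reading compressed files with postgres using named pipes
Introduction to Named Pipes
Best practice to run bash script in background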

For bulk INSERT you have better solutions than a separate INSERT per row. Use this syntax variant:

INSERT INTO mytable (col1, col2, col3) VALUES
 (1, 'foo', 'bar')
,(2, 'goo', 'gar')
,(3, 'hoo', 'har')
...
;
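
As a rough sketch of how such a statement could be generated from a data file, the helper below turns tab-separated lines into one multi-row INSERT. The function name build_insert, the target x.x (id, a) and the sample rows are hypothetical; the sketch assumes an integer first column, a text second column and at least one input line.

```shell
#!/usr/bin/env bash
# Sketch: turn tab-separated "id<TAB>text" lines into one multi-row INSERT.
set -euo pipefail

# build_insert is a hypothetical helper; table and column names are illustrative.
build_insert() {
    awk -F '\t' -v q="'" '
        BEGIN { print "INSERT INTO x.x (id, a) VALUES" }
        {
            gsub(q, q q, $2)    # double any embedded single quote for SQL
            printf "%s(%d, %s%s%s)\n", (NR == 1 ? " " : ","), $1, q, $2, q
        }
        END { print ";" }
    '
}

# Three sample rows, one of them containing a single quote.
insert_sql=$(printf '1\tfoo\n2\tbar\n3\tO'\''Brien\n' | build_insert)
printf '%s\n' "$insert_sql"
```

The output can be written to a file or piped straight into psql. For large volumes, COPY is usually the faster route than any INSERT variant.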

Write your statements to a file and do one mass INSERT like this:

psql -h remote_server -U username -d database -p 5432 -f my_insert_file.sql

(5432 or whatever port the db-cluster is listening on)
my_insert_file.sql can hold multiple SQL statements. In fact, it's common practice to restore / deploy whole databases like that. Consult the manual about the -f parameter, or in bash: man psql.

Or, if you can transfer the (compressed) file to the server, you can use COPY to insert the (decompressed) data even faster.

You can also do some or all of the processing inside PostgreSQL. For that you can COPY (or INSERT) into a temporary table and use plain SQL statements to prepare and finally INSERT / UPDATE your tables. I do that a lot. Be aware that temporary tables live and die with the session.
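
A minimal sketch of that staging approach, assuming a tab-separated file on the client: the helper prints one script that creates a temporary table, loads the file with psql's \copy, and inserts the cleaned rows. The helper name staging_script, the table name staging, the file name data.tsv and the trim() step are hypothetical placeholders for your own processing.

```shell
#!/usr/bin/env bash
# Sketch: emit one SQL script that stages raw rows in a temp table, then inserts them.
set -euo pipefail

# staging_script is a hypothetical helper; "staging" and trim() are illustrative only.
# The data file path must not contain single quotes in this simple form.
staging_script() {
    local data_file=$1
    cat <<SQL
CREATE TEMP TABLE staging (id int, a text);
\copy staging FROM '$data_file'
INSERT INTO x.x (id, a) SELECT id, trim(a) FROM staging;
SQL
}

script=$(staging_script data.tsv)
printf '%s\n' "$script"
```

Feeding the whole script to a single psql call, for example staging_script data.tsv | psql -h remote_server -U username -d database, keeps all three steps in one session, so the temporary table is still there when the final INSERT runs.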

You could use a GUI like pgAdmin for comfortable handling. A session in an SQL Editor window remains open until you close the window. (Therefore, temporary tables live until you close the window.)
