Select bottleneck and INSERT INTO SELECT doesn't work on CockroachDB
I have to copy data between 2 tables with the query below, and table2 has 15GB of data. But it shows an error. I set max-sql-memory=.80 and I don't know how to solve this. When I execute this query with a LIMIT 50000 option, it works! Even select * from table2 shows the same error. I think there is a select bottleneck somehow.... Also, with this query, unusually, the latency of only 1 of the 3 nodes goes up. (AWS EC2 i3.xlarge type)
▶ Query
insert into table1 (
InvoiceID, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName
)
select
InvoiceID, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName
from table2;
▶ Error :
driver: bad connection
warning: connection lost! opening new connection: all session settings will be lost
▶ Log :
W180919 04:59:20.452985 186 storage/raft_transport.go:465 [n3] raft transport stream to node 2 failed: rpc error: code = Unavailable desc = transport is closing
W180919 04:59:20.452996 190 vendor/google.golang.org/grpc/clientconn.go:1158 grpc: addrConn.createTransport failed to connect to {10.240.98.xxx:26257 0 }. Err :connection error: desc = "transport: Error while dialing cannot reuse client connection". Reconnecting...
If I'm understanding your question correctly, you're using a single statement to read ~15GB of data from table2 and insert it into table1. Unfortunately, as you've discovered, this won't work. See limits for a single statement, which covers exactly this scenario. Setting --max-sql-memory=.80 will not help and most likely will hurt, as CockroachDB needs some breathing room because our memory tracking is not precise. The "bad connection" warning and the error you found in the logs are both symptoms which occur when a Cockroach process has crashed, presumably due to running out of memory.
If you need to copy the data from table2 to table1 transactionally, then you're a bit out of luck at this time. While you could try using an explicit transaction and breaking the single INSERT statement into multiple statements, you'll very likely run into transaction size limits. If you can handle performing the copy non-transactionally, then I'd suggest breaking the INSERT into pieces. Something along the lines of:
INSERT INTO table1 (...)
SELECT ... FROM table2 WHERE InvoiceID > $1 ORDER BY InvoiceID LIMIT 10000
RETURNING InvoiceID
The idea here is that you copy in 10k-row batches (the ORDER BY keeps each batch's rows contiguous, so no rows are skipped when you resume). You would use the RETURNING InvoiceID clause to track the last InvoiceID that was copied and start the next insert from there.