Select bottleneck and INSERT INTO ... SELECT doesn't work on CockroachDB


Problem Description


I have to merge 2 tables with the query below, and table2 has 15 GB of data. But it shows errors. I set --max-sql-memory=.80 and I don't know how to solve this. When I execute this query with a LIMIT 50000 option, it works! Even SELECT * FROM table2 shows the same error, so I think there is a select bottleneck somewhere. Also, with this query, unusually, only 1 of the 3 nodes' latency goes up. (AWS EC2 i3.xlarge instances)

▶ Query

insert into table1 (
  InvoiceID, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName
)
select
  InvoiceID, PayerAccountId, LinkedAccountId, RecordType, RecordId, ProductName
from table2;

▶ Error:

driver: bad connection
warning: connection lost!
opening new connection: all session settings will be lost

▶ Log:

W180919 04:59:20.452985 186 storage/raft_transport.go:465 [n3] raft transport stream to node 2 failed: rpc error: code = Unavailable desc = transport is closing
W180919 04:59:20.452996 190 vendor/google.golang.org/grpc/clientconn.go:1158 grpc: addrConn.createTransport failed to connect to {10.240.98.xxx:26257 0}. Err: connection error: desc = "transport: Error while dialing cannot reuse client connection". Reconnecting...

Solution

If I'm understanding your question correctly, you're using a single statement to read ~15GB of data from table2 and insert it into table1. Unfortunately, as you've discovered, this won't work. See limits for a single statement, which covers exactly this scenario. Setting --max-sql-memory=.80 will not help and will most likely hurt, as CockroachDB needs some breathing room because our memory tracking is not precise. The "bad connection" warning and the error you found in the logs are both symptoms that occur when a Cockroach process has crashed, presumably due to running out of memory.

If you need to copy the data from table2 to table1 transactionally, then you're a bit out of luck at this time. While you could try using an explicit transaction and breaking the single INSERT statement into multiple statements, you'll very likely run into transaction size limits. If you can perform the copy non-transactionally, then I'd suggest breaking the INSERT into pieces. Something along the lines of:

INSERT INTO table1 (...)
SELECT ... FROM table2
WHERE InvoiceID > $1
LIMIT 10000
RETURNING InvoiceID

The idea here is that you copy in 10k row batches. You would use the RETURNING InvoiceID clause to track the last InvoiceID that was copied and start the next insert from there.
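
To make the batching concrete, here is a minimal sketch of that loop in Python. It is only an illustration of the pattern above, assuming a Postgres-compatible driver (psycopg2, which works against CockroachDB), a placeholder connection string, and a string-typed, unique InvoiceID whose values all sort above the empty string; an ORDER BY InvoiceID is added so the > comparison advances through the table deterministically.

import psycopg2

# Copies one batch of rows whose InvoiceID is greater than the given
# resume point, and returns the IDs it inserted via RETURNING.
COPY_SQL = """
    INSERT INTO table1 (InvoiceID, PayerAccountId, LinkedAccountId,
                        RecordType, RecordId, ProductName)
    SELECT InvoiceID, PayerAccountId, LinkedAccountId,
           RecordType, RecordId, ProductName
    FROM table2
    WHERE InvoiceID > %s
    ORDER BY InvoiceID
    LIMIT 10000
    RETURNING InvoiceID
"""

# Hypothetical connection string; adjust host, port, user, and database.
conn = psycopg2.connect("postgresql://root@localhost:26257/mydb")
conn.autocommit = True  # each batch commits on its own, keeping transactions small

last_id = ""  # assumes InvoiceID is a string and every value sorts above ""
with conn.cursor() as cur:
    while True:
        cur.execute(COPY_SQL, (last_id,))
        rows = cur.fetchall()  # the RETURNING clause yields the copied IDs
        if not rows:
            break  # nothing left to copy
        last_id = max(r[0] for r in rows)  # resume point for the next batch
conn.close()

Because each batch commits on its own, an interrupted copy can be resumed by seeding last_id with SELECT max(InvoiceID) FROM table1 instead of the empty string, provided table1 receives rows only from this copy.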

