NoHostAvailableException With Cassandra & DataStax Java Driver If Large ResultSet

Problem description


  • 2-node Cassandra 1.2.6 cluster
  • replicas=2
  • very large CQL3 table with no secondary index
  • Rowkey is a UUID.randomUUID().toString()
  • read consistency set to ONE
  • Using DataStax java driver 1.0

Attempting to do a table scan by "SELECT some-col from schema.table LIMIT nnn;"

Once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver.

It looks like this:

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
            at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64)
            at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
            at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
            at com.jpmc.es.rtm.storage.impl.EventExtract.main(EventExtract.java:36)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:601)
            at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
            at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:98)
            at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:165)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)



Given: This is probably not the most enlightened thing to do to a large table with millions of rows, but this is how I learn what not to do, so I would really appreciate someone who could volunteer how this kind of error can be debugged.

For example, when this happens, there are no indications that the nodes in the cluster ever had an issue with the request (there is nothing in the logs on either node that indicates any timeout or failure). Also, I enabled tracing on the driver, which gives you some nice autotrace (à la Oracle) info as long as the query succeeds. But in this case, the driver throws a NoHostAvailableException and no ExecutionInfo is available, so tracing has not provided any benefit.

I also find it interesting that this does not seem to be recorded as a timeout (my JMX consoles tell me no timeouts have occurred). So, I am left not understanding WHERE the failure is actually occurring. I am left with the idea that it is the driver that is having a problem, but I don't know how to debug it (and I would really like to).

I have read several posts from folks who state that querying for result sets of more than 10000 rows is probably not a good idea, and I am willing to accept this, but I would like to understand what is causing the exception and where the exception is happening.

FWIW, I also tried bumping the timeout properties in the cassandra.yaml, but this made no difference whatsoever.

I welcome any suggestions, anecdotes, insults, or monetary contributions for my registration in the house of moron-developers.

Regards!

Recommended answer

My guess (and perhaps others can confirm) is that you are putting too high a load on the cluster by the query which is causing the timeout. So, yes, it's a little difficult to debug as it's not obvious what the root cause was: was the limit I set too large or is the cluster actually down?
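
On the debugging side, one generic step that may help is to log every throwable in the cause chain rather than only the outermost message. The sketch below uses only the JDK; the class and method names are made up for illustration, and the exceptions in main are stand-ins, not real driver output. The driver's NoHostAvailableException also exposes per-host errors through getErrors(), but its return type differs between driver versions, so it is not used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the DataStax driver.
final class CauseChainSketch {

    /** Returns one "ClassName: message" line per throwable, outermost first. */
    static List<String> describeCauses(Throwable t) {
        List<String> lines = new ArrayList<String>();
        // Cap the depth to guard against a cyclic cause chain.
        for (int depth = 0; t != null && depth < 32; depth++) {
            lines.add(t.getClass().getName() + ": " + t.getMessage());
            t = t.getCause();
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in exceptions; a real run would pass the exception caught
        // around session.execute(...).
        Throwable root = new java.io.IOException("connection reset");
        Throwable outer = new RuntimeException("All host(s) tried for query failed", root);
        for (String line : describeCauses(outer)) {
            System.out.println(line);
        }
    }
}
```

If the chain ends at the same exception type it started with, as in the trace in the question, the underlying error was not attached as a cause and has to be found elsewhere, for example in the driver's own logging.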

You want to avoid setting large limits on the amount of data you request in a single query, typically by setting a reasonable limit and paging through the results, e.g.,

SELECT * FROM messages WHERE user_id = 101 LIMIT 1000;
SELECT * FROM messages WHERE user_id = 101 AND msg_id > [Last message ID received] LIMIT 1000;
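
The loop that drives those two queries can be sketched as follows. To keep it runnable without a cluster, a TreeMap keyed by msg_id stands in for the messages table, and fetchPage mimics "msg_id > last LIMIT n"; all names here are hypothetical, and a real version would call session.execute with the two statements above instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory illustration of manual paging; no Cassandra involved.
final class ManualPagingSketch {

    /** Builds a stand-in table with msg_id values 1..count. */
    static TreeMap<Integer, String> messages(int count) {
        TreeMap<Integer, String> table = new TreeMap<Integer, String>();
        for (int id = 1; id <= count; id++) {
            table.put(id, "message " + id);
        }
        return table;
    }

    /** Mimics "WHERE msg_id > lastId LIMIT limit": ids in ascending order. */
    static List<Integer> fetchPage(TreeMap<Integer, String> table, int lastId, int limit) {
        List<Integer> ids = new ArrayList<Integer>();
        for (Integer id : table.tailMap(lastId, false).keySet()) {
            if (ids.size() == limit) break;
            ids.add(id);
        }
        return ids;
    }

    /** Pages through the whole table and returns how many rows were seen. */
    static int countAll(TreeMap<Integer, String> table, int limit) {
        if (limit <= 0) throw new IllegalArgumentException("limit must be positive");
        int total = 0;
        int lastId = Integer.MIN_VALUE; // sentinel for "no page fetched yet"
        while (true) {
            List<Integer> page = fetchPage(table, lastId, limit);
            total += page.size();
            if (page.size() < limit) break;       // short page: nothing left
            lastId = page.get(page.size() - 1);   // resume after the last id seen
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countAll(messages(2500), 1000));
    }
}
```

Each request is bounded by the limit, so no single query has to return the whole table.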

The Automatic Paging functionality added in (see this document, where the code examples in this answer are copied from) is a big improvement in datastax java-driver as it removes the need to manually page and lets you do the following:

Statement stmt = new SimpleStatement("SELECT * FROM images");
stmt.setFetchSize(100);
ResultSet rs = session.execute(stmt);

// Iterate over the ResultSet here

While this won't necessarily solve your problem it will minimise the possibility that it was a "too-big" query.
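
To illustrate what a fetch size buys you, here is a rough, driver-free simulation: an iterator that only pulls the next page of rows when the current one is exhausted. The class is hypothetical and the "server" is just a counter; it is meant to show the idea of on-demand page fetching, not how the driver implements it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Simulation only: rows are the integers 0..totalRows-1.
final class LazyPagingSketch implements Iterator<Integer> {
    private final int totalRows;
    private final int fetchSize;
    private int nextRow = 0;                        // next row to request
    private List<Integer> page = new ArrayList<Integer>();
    private int posInPage = 0;
    int pagesFetched = 0;                           // visible for inspection

    LazyPagingSketch(int totalRows, int fetchSize) {
        if (fetchSize <= 0) throw new IllegalArgumentException("fetchSize must be positive");
        this.totalRows = totalRows;
        this.fetchSize = fetchSize;
    }

    // Simulates one round trip that returns at most fetchSize rows.
    private void fetchNextPage() {
        page = new ArrayList<Integer>();
        for (int i = 0; i < fetchSize && nextRow < totalRows; i++) {
            page.add(nextRow++);
        }
        posInPage = 0;
        pagesFetched++;
    }

    @Override
    public boolean hasNext() {
        if (posInPage < page.size()) return true;   // rows left in current page
        if (nextRow >= totalRows) return false;     // nothing left to fetch
        fetchNextPage();
        return posInPage < page.size();
    }

    @Override
    public Integer next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.get(posInPage++);
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) {
        LazyPagingSketch rows = new LazyPagingSketch(250, 100);
        int count = 0;
        while (rows.hasNext()) {
            rows.next();
            count++;
        }
        System.out.println(count + " rows in " + rows.pagesFetched + " pages");
    }
}
```

In this simulation only one page of rows is held in memory at a time, regardless of how many rows the query matches.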
