What's causing these ParseError exceptions when reading off an AWS SQS queue in my Storm cluster

Question

I'm using Storm 0.8.1 to read incoming messages off an Amazon SQS queue and am getting consistent exceptions when doing so:

2013-12-02 02:21:38 executor [ERROR] 
java.lang.RuntimeException: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.)
        at REDACTED.spouts.SqsQueueSpout.handleNextTuple(SqsQueueSpout.java:219)
        at REDACTED.spouts.SqsQueueSpout.nextTuple(SqsQueueSpout.java:88)
        at backtype.storm.daemon.executor$fn__3976$fn__4017$fn__4018.invoke(executor.clj:447)
        at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
        at clojure.lang.AFn.run(AFn.java:24)
        at java.lang.Thread.run(Thread.java:701)
Caused by: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.)
        at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:524)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:298)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:167)
        at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:812)
        at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:575)
        at REDACTED.spouts.SqsQueueSpout.handleNextTuple(SqsQueueSpout.java:191)
        ... 5 more
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
        at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.setInputSource(XMLStreamReaderImpl.java:219)
        at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.<init>(XMLStreamReaderImpl.java:189)
        at com.sun.xml.internal.stream.XMLInputFactoryImpl.getXMLStreamReaderImpl(XMLInputFactoryImpl.java:277)
        at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:129)
        at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLEventReader(XMLInputFactoryImpl.java:78)
        at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:85)
        at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:41)
        at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:503)
        ... 10 more

I've debugged the data on the queue and everything looks good. I can't figure out why the API's XML response would be causing these problems. Any ideas?

Recommended answer

Answering my own question here for the ages.

There's currently a bug in how both Oracle's Java and OpenJDK enforce the XML entity expansion limit: a counter is shared across parses, so it eventually hits the default upper bound when many XML documents are parsed, even though no single document comes close to it.

  1. https://blogs.oracle.com/joew/entry/jdk_7u45_aws_issue_123
  2. https://bugs.openjdk.java.net/browse/JDK-8028111
  3. https://github.com/aws/aws-sdk-java/issues/123

Although I thought that our version (6b27-1.12.6-1ubuntu0.12.04.4) wasn't affected, running the sample code given in the OpenJDK bug report did indeed verify that we were susceptible to the bug.
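
For readers who want a similar check, here is a minimal sketch in the same spirit. It is not the sample code from the OpenJDK bug report; the class name `EntityLimitCheck`, the method `parseRepeatedly`, and the tiny test document are all made up for illustration. It assumes the counter is shared by readers created from a single `XMLInputFactory`, so it parses more documents than the 64000 limit shown in the stack trace through one factory. On an affected JDK the loop would be expected to fail partway through; on a fixed JDK it should run to completion.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not taken from the bug report.
class EntityLimitCheck {

    // Parses the same tiny document `iterations` times through ONE shared
    // factory and returns how many parses completed before any failure.
    static int parseRepeatedly(int iterations) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        String xml = "<r>ok</r>"; // no DTD and no entities of its own
        int parsed = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                XMLStreamReader reader =
                        factory.createXMLStreamReader(new StringReader(xml));
                while (reader.hasNext()) {
                    reader.next(); // walk the whole document
                }
                reader.close();
                parsed++;
            }
        } catch (XMLStreamException e) {
            System.err.println("Failed after " + parsed + " documents: " + e.getMessage());
        }
        return parsed;
    }

    public static void main(String[] args) {
        int target = 70000; // just above the 64000 limit from the stack trace
        int parsed = parseRepeatedly(target);
        if (parsed == target) {
            System.out.println("No shared-counter failure observed");
        } else {
            System.out.println("Stopped early after " + parsed + " documents");
        }
    }
}
```

Run it with the same JVM binary that the Storm workers use, since the result depends on the exact JDK build.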

To work around the issue, I needed to pass jdk.xml.entityExpansionLimit=0 to the Storm workers. By adding the following to storm.yaml across my cluster, I was able to mitigate this problem.

supervisor.childopts: "-Djdk.xml.entityExpansionLimit=0"
worker.childopts: "-Djdk.xml.entityExpansionLimit=0"
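
To confirm that the flag actually reached a worker JVM, one option is to log the system property at startup, for example from a spout's initialization code. The sketch below is only an illustration: the class `ExpansionLimitProbe` and its method are hypothetical names, and it relies on nothing beyond `System.getProperty`.

```java
// Hypothetical helper for logging whether the workaround flag is present.
class ExpansionLimitProbe {

    static final String PROPERTY = "jdk.xml.entityExpansionLimit";

    // Returns a one-line description suitable for a startup log message.
    static String describeLimit() {
        String value = System.getProperty(PROPERTY);
        if (value == null) {
            return PROPERTY + " is not set (JDK default applies)";
        }
        return PROPERTY + "=" + value;
    }

    public static void main(String[] args) {
        System.out.println(describeLimit());
    }
}
```

Running it as `java -Djdk.xml.entityExpansionLimit=0 ExpansionLimitProbe` should report the property as set, while running it without the flag should report that the JDK default applies.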

I should note that setting the limit to 0 removes it entirely, which technically opens you up to a Denial of Service attack via entity expansion. Since our XML documents only come from SQS, I'm not worried about someone forging malicious XML to kill our workers.
