NegativeArraySizeException ANTLRv4

Question

I have a 10 GB file that I need to parse in Java, but the following error arises when I attempt to do this.

java.lang.NegativeArraySizeException
        at java.util.Arrays.copyOf(Arrays.java:2894)
        at org.antlr.v4.runtime.ANTLRInputStream.load(ANTLRInputStream.java:123)
        at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:86)
        at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:82)
        at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:90)

How can I solve this problem properly? How can I adjust such an input stream to handle this error?

Recommended answer

It looks like ANTLR v4 has a pervasive hard-wired limitation that the input stream size must be less than 2^31 characters. Removing this limitation would not be a small task.

If you look at the source code of the ANTLRInputStream class, you can see that it attempts to hold the entire stream contents in a single char[]. That ain't going to work for huge input files. But simply fixing that by buffering the data in a larger data structure isn't going to be the answer either. If you look further down the file, there are a number of other methods that use int as the type for indexing the stream. They would need to be changed to use long, and the changes would ripple out.
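To illustrate why the failure surfaces as a NegativeArraySizeException rather than an out-of-memory error, here is a minimal sketch. It assumes the buffer is grown by doubling an int length before calling Arrays.copyOf, which is what the Arrays.copyOf frame in the stack trace suggests; the class and method names (BufferOverflowDemo, doubled, copyOfRejects) are hypothetical and not part of ANTLR.

```java
import java.util.Arrays;

class BufferOverflowDemo {
    // Doubles a capacity using 32-bit int arithmetic, as a growable
    // array-backed buffer typically would (assumption, not ANTLR's code).
    static int doubled(int capacity) {
        return capacity * 2;
    }

    // Returns true if Arrays.copyOf rejects the requested length with
    // NegativeArraySizeException. Only call this with small or negative
    // lengths; a large positive length would really allocate that much.
    static boolean copyOfRejects(int newLength) {
        try {
            Arrays.copyOf(new char[1], newLength);
            return false;
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int capacity = 1 << 30;        // 2^30 chars already buffered
        int next = doubled(capacity);  // 2^31 does not fit in an int and wraps around
        System.out.println(next);                 // prints -2147483648
        System.out.println(copyOfRejects(next));  // prints true
    }
}
```

Because Java arrays are indexed by int, no amount of heap makes a single char[] large enough for a 10 GB input.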

How can I solve this problem properly? How can I adjust such an input stream to handle this error?

Two approaches come to mind:

  • Create your own version of ANTLR that supports large input files. This is a non-trivial project. I expect that the 32 bit assumption reaches into the code that ANTLR generates, etc.

  • Split your input files into smaller files before you attempt to parse them. Whether this is viable depends on the input syntax.

My recommendation would be the second alternative. The problem with "supporting" huge input files (by in-memory buffering) is that it is going to be inefficient and wasteful of memory ... and it ultimately doesn't scale.
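As a rough sketch of the second alternative, the following streams a file line by line and writes it out in chunks of a fixed number of lines. It assumes the input is UTF-8 text in which every line is a complete, independently parsable unit; if the grammar allows constructs that span lines, the split points would have to be chosen differently. The class name LineChunkSplitter and the chunk file naming are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class LineChunkSplitter {
    // Streams `input` line by line and writes files of at most `maxLines`
    // lines each into `outDir`. Returns the chunk files in input order.
    // Only one line is held in memory at a time.
    static List<Path> split(Path input, Path outDir, long maxLines) throws IOException {
        if (maxLines <= 0) {
            throw new IllegalArgumentException("maxLines must be positive");
        }
        Files.createDirectories(outDir);
        List<Path> chunks = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            long linesInChunk = 0;
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Start a new chunk for the first line and whenever the current one is full.
                    if (writer == null || linesInChunk == maxLines) {
                        if (writer != null) {
                            writer.close();
                        }
                        Path chunk = outDir.resolve(String.format("chunk-%05d.txt", chunks.size()));
                        chunks.add(chunk);
                        writer = Files.newBufferedWriter(chunk, StandardCharsets.UTF_8);
                        linesInChunk = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInChunk++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LineChunkSplitter.java <input file> <output dir> <lines per chunk>
        List<Path> chunks = split(Paths.get(args[0]), Paths.get(args[1]), Long.parseLong(args[2]));
        System.out.println(chunks.size() + " chunk(s) written");
    }
}
```

Each resulting file can then be parsed on its own, for example with a separate input stream and parser run per chunk, so no single buffer has to approach the 2^31 character limit.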

You could also create an issue here, or ask on antlr-discussion.
