Optimizing processing and management of large Java data arrays


Question

I'm writing some pretty CPU-intensive, concurrent numerical code that will process large amounts of data stored in Java arrays (e.g. lots of double[100000]s). Some of the algorithms might run millions of times over several days so getting maximum steady-state performance is a high priority.

In essence, each algorithm is a Java object that has a method API something like:

   public double[] runMyAlgorithm(double[] inputData);

or alternatively a reference could be passed to the array to store the output data:

   public void runMyAlgorithm(double[] inputData, double[] outputData);

Given this requirement, I'm trying to determine the optimal strategy for allocating / managing array space. Frequently the algorithms will need large amounts of temporary storage space. They will also take large arrays as input and create large arrays as output.

Among the options I am considering are:

  • Always allocate new arrays as local variables whenever they are needed (e.g. new double[100000]). Probably the simplest approach, but will produce a lot of garbage.
  • Pre-allocate temporary arrays and store them as final fields in the algorithm object - big downside would be that this would mean that only one thread could run the algorithm at any one time.
  • Keep pre-allocated temporary arrays in ThreadLocal storage, so that a thread can use a fixed amount of temporary array space whenever it needs it. ThreadLocal would be required since multiple threads will be running the same algorithm simultaneously.
  • Pass around lots of arrays as parameters (including the temporary arrays for the algorithm to use). Not good since it will make the algorithm API extremely ugly if the caller has to be responsible for providing temporary array space....
  • Allocate extremely large arrays (e.g. double[10000000]) but also provide the algorithm with offsets into the array so that different threads will use a different area of the array independently. Will obviously require some code to manage the offsets and allocation of the array ranges.
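The ThreadLocal option above might look something like the following minimal sketch. The class and method names (`ScratchBuffers`, `runMyAlgorithm`'s doubling logic) are illustrative stand-ins, not anything from the question; the sketch assumes inputs never exceed the fixed scratch size.

```java
// Sketch of option 3: per-thread pre-allocated temporary storage.
// Assumes every input fits in BUFFER_SIZE elements.
public class ScratchBuffers {
    private static final int BUFFER_SIZE = 100_000;

    // Each thread lazily gets its own scratch array, so concurrent callers
    // never share (or repeatedly reallocate) temporary storage.
    private static final ThreadLocal<double[]> SCRATCH =
            ThreadLocal.withInitial(() -> new double[BUFFER_SIZE]);

    // Stand-in algorithm: doubles every element, using the thread-local
    // array as workspace instead of allocating a fresh temporary.
    public static double[] runMyAlgorithm(double[] inputData) {
        double[] tmp = SCRATCH.get();
        for (int i = 0; i < inputData.length; i++) {
            tmp[i] = inputData[i] * 2.0; // placeholder for real numeric work
        }
        double[] outputData = new double[inputData.length];
        System.arraycopy(tmp, 0, outputData, 0, inputData.length);
        return outputData;
    }
}
```

Note that only the output array is allocated per call; the large temporary is allocated once per thread, which is the point of the pattern.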

Any thoughts on which approach would be best (and why)?

Solution

What I have noticed when working with memory in Java is the following. If your memory-usage patterns are simple (mostly two or three types of allocation), you can usually do better than the default allocator. You can either preallocate a pool of buffers at application startup and use them as needed, or go the other route (allocate a huge array at the beginning and hand out pieces of it when needed). In effect you are writing your own memory allocator. But chances are you will do a worse job than Java's default allocator.

I would probably try to do the following: standardize the buffer sizes and allocate normally. That way, after a while, the only memory allocations/deallocations will be of fixed sizes, which greatly helps the garbage collector run fast. Another thing I would do is make sure, at algorithm design time, that the total memory needed at any one point does not exceed something like 80-85% of the machine's memory, in order not to trigger a full collection inadvertently.
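One way to read "standardize the buffer sizes" is to round every requested size up to the next member of a small fixed set, for example powers of two, so the allocator and GC only ever see a handful of distinct sizes. This sketch is an illustration of that idea, not the answerer's code; the class name and the 1024-element floor are assumptions.

```java
// Sketch of standardized buffer sizes: every request is rounded up to
// the next power of two (with a 1024-element minimum), so all array
// allocations fall into a small set of fixed sizes.
public final class BufferSizes {
    private BufferSizes() {}

    // Smallest power of two >= requested, never below 1024 elements.
    public static int standardize(int requested) {
        int size = 1024;
        while (size < requested) {
            size <<= 1;
        }
        return size;
    }

    // Allocate a double[] of the standardized size; callers use only
    // the first `requested` elements.
    public static double[] allocate(int requested) {
        return new double[standardize(requested)];
    }
}
```

A natural extension, in the spirit of the answer, is to pair this with a pool keyed by standardized size, so freed buffers of each size class can be reused directly.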

Apart from those heuristics, I would probably test the hell out of any solution I picked and see how it works in practice.
