Why does this Java method leak, and why does inlining it fix the leak?

Problem description

I wrote a minimal somewhat-lazy (int) sequence class, GarbageTest.java, as an experiment, to see if I could process very long, lazy sequences in Java, the way I can in Clojure.

Given a naturals() method that returns the lazy, infinite, sequence of natural numbers; a drop(n,sequence) method that drops the first n elements of sequence and returns the rest of the sequence; and an nth(n,sequence) method that returns simply: drop(n, lazySeq).head(), I wrote two tests:

static int N = (int)1e6;

// succeeds @ N = (int)1e8 with java -Xmx10m
@Test
public void dropTest() {
    assertThat( drop(N, naturals()).head(), is(N+1));
}

// fails with OutOfMemoryError @ N = (int)1e6 with java -Xmx10m
@Test
public void nthTest() {
    assertThat( nth(N, naturals()), is(N+1));
}

Note that the body of dropTest() was generated by copying the body of nthTest() and then invoking IntelliJ's "inline" refactoring on the nth(N, naturals()) call. So it seems to me that the behavior of dropTest() should be identical to the behavior of nthTest().

But it isn't identical! dropTest() runs to completion with N up to 1e8 whereas nthTest() fails with OutOfMemoryError for N as small as 1e6.

I've avoided inner classes. And I've experimented with a variant of my code, ClearingArgsGarbageTest.java, that nulls method parameters before calling other methods. I've applied the YourKit profiler. I've looked at the byte code. I just cannot find the leak that causes nthTest() to fail.

Where's the "leak"? And why does nthTest() have the leak while dropTest() does not?

Here's the rest of the code from GarbageTest.java in case you don't want to click through to the GitHub project:

/**
 * a not-perfectly-lazy lazy sequence of ints. see LazierGarbageTest for a lazier one
 */
static class LazyishSeq {
    final int head;

    volatile Supplier<LazyishSeq> tailThunk;
    LazyishSeq tailValue;

    LazyishSeq(final int head, final Supplier<LazyishSeq> tailThunk) {
        this.head = head;
        this.tailThunk = tailThunk;
        tailValue = null;
    }

    int head() {
        return head;
    }

    LazyishSeq tail() {
        if (null != tailThunk)
            synchronized(this) {
                if (null != tailThunk) {
                    tailValue = tailThunk.get();
                    tailThunk = null;
                }
            }
        return tailValue;
    }
}

static class Incrementing implements Supplier<LazyishSeq> {
    final int seed;
    private Incrementing(final int seed) { this.seed = seed;}

    public static LazyishSeq createSequence(final int n) {
        return new LazyishSeq( n, new Incrementing(n+1));
    }

    @Override
    public LazyishSeq get() {
        return createSequence(seed);
    }
}

static LazyishSeq naturals() {
    return Incrementing.createSequence(1);
}

static LazyishSeq drop(
        final int n,
        final LazyishSeq lazySeqArg) {
    LazyishSeq lazySeq = lazySeqArg;
    for( int i = n; i > 0 && null != lazySeq; i -= 1) {
        lazySeq = lazySeq.tail();
    }
    return lazySeq;
}

static int nth(final int n, final LazyishSeq lazySeq) {
    return drop(n, lazySeq).head();
}

Solution

In your method

static int nth(final int n, final LazyishSeq lazySeq) {
    return drop(n, lazySeq).head();
}

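the parameter variable lazySeq holds a reference to the first element of your sequence during the entire drop operation. Since each element references its successor through tailValue once the tail has been computed, this keeps the whole sequence reachable and prevents it from being garbage collected.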

In contrast, with

public void dropTest() {
    assertThat( drop(N, naturals()).head(), is(N+1));
}

the first element of your sequence is returned by naturals() and passed directly to the invocation of drop. The reference is therefore popped from the operand stack and no longer exists in the frame of dropTest during the execution of drop.
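The difference between the two call shapes can be sketched as follows. This is a minimal illustration, not the code from the question: Seq is a simplified, single-threaded stand-in for LazyishSeq, and nthFrom is a hypothetical variant of nth that receives a Supplier instead of the sequence itself, so that its own frame never holds the first element in a parameter or local variable. Like dropTest(), it still depends on drop not retaining the head.

```java
import java.util.function.Supplier;

final class NthSketch {
    // Simplified stand-in for the question's LazyishSeq (no locking; assumption: single thread).
    static final class Seq {
        final int head;
        private Supplier<Seq> tailThunk;
        private Seq tailValue;

        Seq(int head, Supplier<Seq> tailThunk) {
            this.head = head;
            this.tailThunk = tailThunk;
        }

        Seq tail() {
            if (tailThunk != null) {
                tailValue = tailThunk.get(); // memoize the tail on first access
                tailThunk = null;
            }
            return tailValue;
        }
    }

    static Seq from(int n) {
        return new Seq(n, () -> from(n + 1));
    }

    static Seq naturals() {
        return from(1);
    }

    // Same shape as drop in the question.
    static Seq drop(final int n, final Seq seqArg) {
        Seq seq = seqArg;
        for (int i = n; i > 0 && seq != null; i--) {
            seq = seq.tail();
        }
        return seq;
    }

    // Hypothetical variant of nth: the head only ever lives on the operand stack
    // of this frame and is popped when drop is invoked.
    static int nthFrom(int n, Supplier<Seq> source) {
        return drop(n, source.get()).head;
    }

    public static void main(String[] args) {
        System.out.println(nthFrom(1_000, NthSketch::naturals));
    }
}
```

A caller would write nthFrom(N, NthSketch::naturals) instead of nth(N, naturals()). Whether this actually avoids the OutOfMemoryError on a given JVM has to be measured, for the reasons discussed further down.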

Your attempt to set the parameter variable to null, i.e.

static int nth(final int n, /*final*/ LazyishSeq lazySeqArg) {
    final LazyishSeq lazySeqLocal = lazySeqArg;
    lazySeqArg = null;
    return drop(n,lazySeqLocal).head();
}

does not help: the lazySeqArg variable is now null, but lazySeqLocal still holds a reference to the first element.

In general, a local variable does not prevent garbage collection: collecting an otherwise unused object is permitted even while a variable still refers to it. But that doesn't imply that a particular implementation is capable of doing so.

In the case of the HotSpot JVM, only optimized code gets rid of such unused references. But here, nth is not a hot spot, as the heavy work happens within the drop method.

This is why the same issue does not appear in the drop method, even though it also holds a reference to the first element in its parameter variable. The drop method contains the loop doing the actual work and is therefore very likely to be optimized by the JVM, which may eliminate unused variables and allow the already processed part of the sequence to be collected.
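If relying on the optimizer for drop feels too fragile, one possible alternative is to not keep the head in a separate variable at all. The sketch below is an assumption-laden illustration rather than the code from the question: Seq is a simplified, single-threaded stand-in for LazyishSeq, and drop reuses its (non-final) parameter as the loop cursor, so the reference to the first element is overwritten on the first iteration even in unoptimized code.

```java
import java.util.function.Supplier;

final class DropSketch {
    // Simplified stand-in for the question's LazyishSeq (no locking; assumption: single thread).
    static final class Seq {
        final int head;
        private Supplier<Seq> tailThunk;
        private Seq tailValue;

        Seq(int head, Supplier<Seq> tailThunk) {
            this.head = head;
            this.tailThunk = tailThunk;
        }

        Seq tail() {
            if (tailThunk != null) {
                tailValue = tailThunk.get(); // memoize the tail on first access
                tailThunk = null;
            }
            return tailValue;
        }
    }

    static Seq from(int n) {
        return new Seq(n, () -> from(n + 1));
    }

    static Seq naturals() {
        return from(1);
    }

    // The parameter itself is the cursor: after the first iteration this frame
    // no longer refers to the original head, with or without JIT compilation.
    static Seq drop(int n, Seq seq) {
        for (int i = n; i > 0 && seq != null; i--) {
            seq = seq.tail();
        }
        return seq;
    }

    public static void main(String[] args) {
        System.out.println(drop(1_000, naturals()).head);
    }
}
```

This only addresses the frame of drop. Any caller that keeps the head in one of its own variables, as nth does, can still keep the whole sequence reachable.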

There are many factors which may affect the JVM’s optimizations. Besides the different shape of the code, it seems that rapid memory allocation during the unoptimized phase may also reduce the optimizer’s improvements. Indeed, when I run with -Xcomp, to forbid interpreted execution altogether, both variants run successfully, and even int N = (int)1e9 is no longer a problem. Of course, forcing compilation increases the startup time.
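When comparing runs like these, it helps to record which execution mode the JVM was actually started in. The following is a small sketch, assuming a HotSpot-style JVM where -Xcomp, -Xint and -Xmixed select the mode; describeMode is a hypothetical helper, and treating the last such flag as the winner is an assumption of this sketch, not documented behavior. The java.vm.info system property is printed as a cross-check and may be absent on other JVMs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JvmModeSketch {
    // Hypothetical helper: derives a label from the JVM's startup arguments.
    // Assumption: if several mode flags are present, the last one applies.
    static String describeMode(List<String> jvmArgs) {
        String mode = "mixed";
        for (String arg : jvmArgs) {
            if (arg.equals("-Xcomp")) {
                mode = "compiled only";
            } else if (arg.equals("-Xint")) {
                mode = "interpreted only";
            } else if (arg.equals("-Xmixed")) {
                mode = "mixed";
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, such as -Xmx10m or -Xcomp.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Mode by flags: " + describeMode(jvmArgs));
        // Implementation-specific description; may be null on non-HotSpot JVMs.
        System.out.println("java.vm.info:  " + System.getProperty("java.vm.info"));
    }
}
```

Printing this at the start of each test run makes it easier to tell whether a difference in memory behavior comes from the code or from the way the JVM was launched.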

I have to admit that I do not understand why the mixed mode performs so much worse, and I’ll investigate further. But generally, you have to be aware that the efficiency of the garbage collector is implementation dependent, so objects collected in one environment may stay in memory in another.
