Why is the execution time of this function call changing?

Question

This issue seems to only affect Chrome/V8, and may not be reproducible in Firefox or other browsers. In summary, the execution time of a function callback increases by an order of magnitude or more if the function is called with a new callback anywhere else.

Calling test(callback) arbitrarily many times works as expected, but once you call test(differentCallback), the execution time of the test function increases dramatically no matter what callback is provided (i.e., another call to test(callback) would suffer as well).

This example was updated to use arguments so as to not be optimized to an empty loop. Callback arguments a and b are summed and added to total, which is logged.

function test(callback) {
    let start = performance.now(),
        total = 0;

    // add callback result to total
    for (let i = 0; i < 1e6; i++)
        total += callback(i, i + 1);

    console.log(`took ${(performance.now() - start).toFixed(2)}ms | total: ${total}`);
}

let callback1 = (a, b) => a + b,
    callback2 = (a, b) => a + b;

console.log('FIRST CALLBACK: FASTER');
for (let i = 1; i < 10; i++)
    test(callback1);

console.log('\nNEW CALLBACK: SLOWER');
for (let i = 1; i < 10; i++)
    test(callback2);
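
A variant of the snippet above that returns its measurements instead of logging them can make the three phases easier to compare: first callback, new callback, then the first callback again. This is only a sketch. `timeLoop` and `runPhases` are hypothetical helper names, the iteration and repeat counts are arbitrary, and the actual timings depend on the engine and its version.

```javascript
// Use performance.now() where available, otherwise fall back to Date.now().
const now = () => (typeof performance !== 'undefined' ? performance.now() : Date.now());

// Hypothetical helper: runs the hot loop once and returns the elapsed time and the total.
function timeLoop(callback, iterations = 1e6) {
  const start = now();
  let total = 0;
  for (let i = 0; i < iterations; i++) total += callback(i, i + 1);
  return { ms: now() - start, total };
}

// Two distinct function objects with identical source, as in the question.
const callback1 = (a, b) => a + b;
const callback2 = (a, b) => a + b;

// Hypothetical helper: averages several runs for each phase, in order.
function runPhases(iterations = 1e6, repeats = 9) {
  const phases = [
    ['callback1', callback1],
    ['callback2', callback2],
    ['callback1 again', callback1],
  ];
  return phases.map(([label, cb]) => {
    let ms = 0;
    let total = 0;
    for (let r = 0; r < repeats; r++) {
      const result = timeLoop(cb, iterations);
      ms += result.ms;
      total = result.total;
    }
    return { label, avgMs: ms / repeats, total };
  });
}

for (const { label, avgMs, total } of runPhases()) {
  console.log(`${label}: avg ${avgMs.toFixed(2)}ms | total: ${total}`);
}
```

If the behavior described here is present, the second and third rows would be expected to show a higher average than the first, while the totals stay identical.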

I am developing a StateMachine class (source) for a library I'm writing and the logic works as expected, but in profiling it, I've run into an issue. I noticed that when I ran the profiling snippet (in global scope), it would only take about 8ms to finish, but if I ran it a second time, it would take up to 50ms and eventually balloon as high as 400ms. Typically, running the same named function over and over will cause its execution time to drop as the V8 engine optimizes it, but the opposite seems to be happening here.

I've been able to get rid of the problem by wrapping it in a closure, but then I noticed another weird side effect: Calling a different function that relies on the StateMachine class would break the performance for all code depending on the class.

The class is pretty simple - you give it an initial state in the constructor or init, and you can update the state with the update method, which you pass a callback that accepts this.state as an argument (and usually modifies it). transition is a method that is used to update the state until the transitionCondition is no longer met.

Two test functions are provided: red and blue, which are identical, and each will generate a StateMachine with an initial state of { test: 0 } and use the transition method to update the state while state.test < 1e6. The end state is { test: 1000000 }.

You can trigger the profile by clicking the red or blue button, which will run StateMachine.transition 50 times and log the average time the call took to complete. If you click the red or blue button repeatedly, you will see that it clocks in at less than 10ms without issue - but, once you click the other button and call the other version of the same function, everything breaks, and the execution time for both functions will increase by about an order of magnitude.

// two identical functions, red() and blue()

function red() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(
    state => state.test++, 
    state => state.test < 1e6
  );

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  else return performance.now() - start;
}

function blue() {
  let start = performance.now(),
      stateMachine = new StateMachine({
        test: 0
      });

  stateMachine.transition(
    state => state.test++, 
    state => state.test < 1e6
  );

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  else return performance.now() - start;
}

// display execution time
const display = (time) => document.getElementById('results').textContent = `Avg: ${time.toFixed(2)}ms`;

// handy dandy Array.avg()
Array.prototype.avg = function() {
  return this.reduce((a,b) => a+b) / this.length;
}

// bindings
document.getElementById('red').addEventListener('click', () => {
  const times = [];
  for (var i = 0; i < 50; i++)
    times.push(red());
    
  display(times.avg());
});

document.getElementById('blue').addEventListener('click', () => {
  const times = [];
  for (var i = 0; i < 50; i++)
    times.push(blue());
    
  display(times.avg());
});

<script src="https://cdn.jsdelivr.net/gh/TeleworkInc/state-machine@bd486a339dca1b3ad3157df20e832ec23c6eb00b/StateMachine.js"></script>

<h2 id="results">Waiting...</h2>
<button id="red">Red Pill</button>
<button id="blue">Blue Pill</button>

<style>
body{box-sizing:border-box;padding:0 4rem;text-align:center}button,h2,p{width:100%;margin:auto;text-align:center;font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,Helvetica,Arial,sans-serif,"Apple Color Emoji","Segoe UI Emoji","Segoe UI Symbol"}button{font-size:1rem;padding:.5rem;width:180px;margin:1rem 0;border-radius:20px;outline:none;}#red{background:rgba(255,0,0,.24)}#blue{background:rgba(0,0,255,.24)}
</style>

Ultimately, this behavior is unexpected and, IMO, qualifies as a nontrivial bug. The impact for me is significant - on Intel i7-4770 (8) @ 3.900GHz, my execution times in the example above go from an average of 2ms to 45ms (a 20x increase).

As for nontriviality, consider that any subsequent calls to StateMachine.transition after the first one will be unnecessarily slow, regardless of scope or location in the code. The fact that SpiderMonkey does not slow down subsequent calls to transition signals to me that there is room for improvement for this specific optimization logic in V8.

See below, where subsequent calls to StateMachine.transition are slowed:

// same source, several times

// 1
(function() {
  let start = performance.now(),
    stateMachine = new StateMachine({
      test: 0
    });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
})();


// 2 
(function() {
  let start = performance.now(),
    stateMachine = new StateMachine({
      test: 0
    });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
})();

// 3
(function() {
  let start = performance.now(),
    stateMachine = new StateMachine({
      test: 0
    });

  stateMachine.transition(state => state.test++, state => state.test < 1e6);

  if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
  console.log(`took ${performance.now() - start}ms`);
})();

<script src="https://cdn.jsdelivr.net/gh/TeleworkInc/state-machine@bd486a339dca1b3ad3157df20e832ec23c6eb00b/StateMachine.js"></script>

This performance decrease can be avoided by wrapping the code in a named closure, where presumably the optimizer knows the callbacks will not change:

var test = (function() {
    let start = performance.now(),
        stateMachine = new StateMachine({
            test: 0
        });
  
    stateMachine.transition(state => state.test++, state => state.test < 1e6);
  
    if (stateMachine.state.test !== 1e6) throw 'ASSERT ERROR!';
    console.log(`took ${performance.now() - start}ms`);
});

test();
test();
test();

<script src="https://cdn.jsdelivr.net/gh/TeleworkInc/state-machine@bd486a339dca1b3ad3157df20e832ec23c6eb00b/StateMachine.js"></script>

$ uname -a
Linux workspaces 5.4.0-39-generic #43-Ubuntu SMP Fri Jun 19 10:28:31 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ google-chrome --version
Google Chrome 83.0.4103.116


Recommended answer

V8 developer here. It's not a bug, it's just an optimization that V8 doesn't do. It's interesting to see that Firefox seems to do it...

FWIW, I don't see "ballooning to 400ms"; instead (similar to Jon Trent's comment) I see about 2.5ms at first, and then around 11ms.

Here's the explanation:

When you click only one button, then transition only ever sees one callback. (Strictly speaking it's a new instance of the arrow function every time, but since they all stem from the same function in the source, they're "deduped" for type feedback tracking purposes. Also, strictly speaking it's one callback each for stateTransition and transitionCondition, but that just duplicates the situation; either one alone would reproduce it.) When transition gets optimized, the optimizing compiler decides to inline the called function, because having seen only one function there in the past, it can make a high-confidence guess that it's also always going to be that one function in the future. Since the function does extremely little work, avoiding the overhead of calling it provides a huge performance boost.

Once the second button is clicked, transition sees a second function. It must get deoptimized the first time this happens; since it's still hot it'll get reoptimized soon after, but this time the optimizer decides not to inline, because it's seen more than one function before, and inlining can be very expensive. The result is that from this point onwards, you'll see the time it takes to actually perform these calls. (The fact that both functions have identical source doesn't matter; checking that wouldn't be worth it because outside of toy examples that would almost never be the case.)

There's a workaround, but it's something of a hack, and I don't recommend putting hacks into user code to account for engine behavior. V8 does support "polymorphic inlining", but (currently) only if it can deduce the call target from some object's type. So if you construct "config" objects that have the right functions installed as methods on their prototype, you can get V8 to inline them. Like so:

class StateMachine {
  ...
  transition(config, maxCalls = Infinity) {
    let i = 0;
    while (
      config.condition &&
      config.condition(this.state) &&
      i++ < maxCalls
    ) config.transition(this.state);

    return this;
  }
  ...
}

class RedConfig {
  transition(state) { return state.test++ }
  condition(state) { return state.test < 1e6 }
}
class BlueConfig {
  transition(state) { return state.test++ }
  condition(state) { return state.test < 1e6 }
}

function red() {
  ...
  stateMachine.transition(new RedConfig());
  ...
}
function blue() {
  ...
  stateMachine.transition(new BlueConfig());
  ...
}
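
For readers who want to run the idea end to end, here is a self-contained version of the sketch above. The `StateMachine` class in it is a minimal stand-in written for this example (an assumption, not the library's actual source), and `run` is a hypothetical helper that replaces the elided bodies of `red()` and `blue()`.

```javascript
// Minimal stand-in for the library's StateMachine (assumed shape, not the real source).
class StateMachine {
  constructor(state) {
    this.state = state;
  }

  // Calls config.transition(state) while config.condition(state) holds.
  transition(config, maxCalls = Infinity) {
    let i = 0;
    while (
      config.condition &&
      config.condition(this.state) &&
      i++ < maxCalls
    ) config.transition(this.state);

    return this;
  }
}

// Two config classes with identical methods; each class gives its instances a distinct
// object type, which is what the call target can be deduced from.
class RedConfig {
  transition(state) { return state.test++; }
  condition(state) { return state.test < 1e6; }
}
class BlueConfig {
  transition(state) { return state.test++; }
  condition(state) { return state.test < 1e6; }
}

// Hypothetical helper: builds a machine, runs it to completion, returns the final counter.
function run(ConfigClass) {
  const stateMachine = new StateMachine({ test: 0 });
  stateMachine.transition(new ConfigClass());
  return stateMachine.state.test;
}

console.log(run(RedConfig), run(BlueConfig)); // prints: 1000000 1000000
```

Whether the engine actually inlines both methods is not something this snippet can show by itself; it only demonstrates the shape of the code.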




It might be worth filing a bug (crbug.com/v8/new) to ask if the compiler team thinks that this is worth improving. Theoretically it should be possible to inline several functions that are called directly, and branch between the inlined paths based on the value of the function variable that's being called. However I'm not sure there are many cases where the impact is as pronounced as in this simple benchmark, and I know that recently the trend has been towards inlining less rather than more, because on average that tends to be the better tradeoff (there are drawbacks to inlining, and whether it's worth it is necessarily always a guess, because the engine would have to predict the future in order to be sure).
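
To make "branch between the inlined paths based on the value of the function variable" concrete, the transformation can be written out by hand for the simple `test(callback)` example from the question. This is purely an illustration of the idea, not what V8 emits and not a recommended coding style; `testWithManualDispatch` is a hypothetical name.

```javascript
const callback1 = (a, b) => a + b;
const callback2 = (a, b) => a + b;

// Hand-written equivalent of inlining two known call targets:
// compare the function value, run the matching inlined body,
// and fall back to a real call for any other function.
function testWithManualDispatch(callback, iterations = 1e6) {
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    const a = i;
    const b = i + 1;
    if (callback === callback1) {
      total += a + b;          // inlined body of callback1
    } else if (callback === callback2) {
      total += a + b;          // inlined body of callback2
    } else {
      total += callback(a, b); // unknown target: ordinary call
    }
  }
  return total;
}

console.log(testWithManualDispatch(callback1, 1000)); // prints: 1000000
console.log(testWithManualDispatch(callback2, 1000)); // prints: 1000000
```

The extra comparisons and the duplicated bodies are a small-scale picture of the code-size and guessing costs mentioned above.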

In conclusion, coding with many callbacks is a very flexible and often elegant technique, but it tends to come at an efficiency cost. (There are other varieties of inefficiency: e.g. a call with an inline arrow function like transition(state => state.something) allocates a new function object each time it's executed; that just so happens not to matter much in the example at hand.) Sometimes engines might be able to optimize away the overhead, and sometimes not.
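
The allocation point can be checked directly: evaluating an arrow function expression produces a new function object each time, whereas a callback defined once and reused keeps the same identity. A minimal sketch, with `makeInline`, `makeHoisted`, and `increment` as hypothetical names:

```javascript
// Each call evaluates the arrow expression again, creating a new function object.
function makeInline() {
  return state => state.test++;
}

// Defined once; every call hands back the same function object.
const increment = state => state.test++;
function makeHoisted() {
  return increment;
}

console.log(makeInline() === makeInline());   // prints: false
console.log(makeHoisted() === makeHoisted()); // prints: true
```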
