为什么第一个函数调用的执行速度比所有其他顺序调用快两倍? [英] Why is the first function call is executed two times faster than all other sequential calls?

查看:30
本文介绍了为什么第一个函数调用的执行速度比所有其他顺序调用快两倍?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个自定义的 JS 迭代器实现和用于测量后一个实现的性能的代码:

const ITERATION_END = Symbol('ITERATION_END');const arrayIterator = (array) =>{让索引 = 0;返回 {hasValue: 真,下一个() {if (index >= array.length) {this.hasValue = false;返回 ITERATION_END;}返回数组[索引++];},};};const customIterator = (valueGetter) =>{返回 {hasValue: 真,下一个() {const nextValue = valueGetter();if (nextValue === ITERATION_END) {this.hasValue = false;返回 ITERATION_END;}返回下一个值;},};};const map = (iterator, selector) =>customIterator(() => {const value = iterator.next();返回值 === ITERATION_END ?值:选择器(值);});const filter = (iterator, predicate) =>customIterator(() => {如果(!迭代器.hasValue){返回 ITERATION_END;}让 currentValue = iterator.next();while (iterator.hasValue && currentValue !== ITERATION_END && !predicate(currentValue)) {currentValue = iterator.next();}返回当前值;});const toArray = (迭代器) =>{常量数组 = [];而(迭代器.hasValue){const value = iterator.next();如果(值!== ITERATION_END){数组推(值);}}返回数组;};const test = (fn, 迭代) =>{常量时间 = [];for (让 i = 0; i < 迭代; i++) {const start = performance.now();fn();times.push(performance.now() - 开始);}控制台日志(次);console.log(times.reduce((sum, x) => sum + x, 0)/times.length);}const createData = () =>Array.from({ 长度: 9000000 }, (_, i) => i + 1);const testIterator = (data) =>() =>toArray(map(filter(arrayIterator(data), x => x % 2 === 0), x => x * 2))测试(测试迭代器(创建数据()),10);

测试函数的输出非常奇怪且出乎意料——第一次测试运行的执行速度总是比其他所有运行快两倍.结果之一,其中数组包含所有执行时间,数字是平均值(我在 Node 上运行):

<预><代码>[147.9088459983468,396.3472499996424,374.82447600364685,367.74555300176144,363.6300039961934,362.44370299577713,363.8418449983001,390.86111199855804,360.23125199973583,358.4788999930024]348.6312940984964

使用 Deno 运行时可以观察到类似的结果,但是我无法在其他 JS 引擎上重现这种行为.V8 背后的原因是什么?

环境:节点 v13.8.0、V8 v7.9.317.25-node.28、Deno v1.3.3、V8 v8.6.334

解决方案

(此处为 V8 开发人员.)简而言之:它是内联还是缺少内联,取决于引擎启发式.

对于优化编译器来说,内联被调用的函数可以有显着的好处(例如:避免调用开销,有时可以进行常量折叠,或者消除重复计算,有时甚至为额外的内联创造新的机会),但是成本:它使编译本身变慢,并且由于某些事实证明不成立的假设而增加了以后必须丢弃优化代码(去优化")的风险.什么都不内联会浪费性能,内联一切都会浪费性能,内联正确的函数需要能够预测程序的未来行为,这显然是不可能的.所以编译器使用启发式方法.

V8 的优化编译器目前只有当它始终是在特定位置调用的相同函数时才对内联函数进行试探.在这种情况下,这就是第一次迭代的情况.随后的迭代会创建新的闭包作为回调,从 V8 的角度来看,它们是新函数,因此它们不会被内联.(V8 实际上知道一些高级技巧,允许它在某些情况下删除来自同一来源的重复函数实例并内联它们;但在这种情况下这些不适用[我不知道为什么]).

因此在第一次迭代中,所有内容(包括 x => x % 2 === 0x => x * 2)都被内联到 <代码>toArray.从第二次迭代开始,情况不再如此,而是生成的代码执行实际的函数调用.

那可能没问题;我猜想,在大多数实际应用中,差异几乎无法衡量.(减少的测试用例往往会使这种差异更加突出;但根据对小型测试的观察来更改大型应用程序的设计通常不是最有效的消磨时间的方式,最坏的情况可能会使事情变得更糟.)

此外,为引擎/编译器手动优化代码是一个困难的平衡.我通常会建议不要这样做(因为引擎会随着时间的推移而改进,而且让你的代码更快是他们的工作);另一方面,显然有更高效的代码和更低效的代码,为了最大限度地提高整体效率,每个参与者都需要尽自己的一份力量,也就是说,如果可以,您不妨让引擎的工作更简单.

如果您确实想微调它的性能,您可以通过分离代码和数据来实现,从而确保始终调用相同的函数.例如像你的代码的这个修改版本:

const ITERATION_END = Symbol('ITERATION_END');类数组迭代器{构造函数(数组){this.array = 数组;this.index = 0;}下一个() {if (this.index >= this.array.length) 返回 ITERATION_END;返回 this.array[this.index++];}}函数数组迭代器(数组){返回新的 ArrayIterator(array);}类 MapIterator {构造函数(源,修饰符){this.source = 来源;this.modifier = 修饰符;}下一个() {const 值 = this.source.next();返回值 === ITERATION_END ?值 : this.modifier(value);}}功能映射(迭代器,选择器){返回新的 MapIterator(iterator, selector);}类过滤迭代器{构造函数(源,谓词){this.source = 来源;this.predicate = 谓词;}下一个() {让值 = this.source.next();while (value !== ITERATION_END && !this.predicate(value)) {值 = this.source.next();}返回值;}}函数过滤器(迭代器,谓词){return new FilterIterator(iterator, predicate);}函数 toArray(迭代器){常量数组 = [];让价值;while ((value = iterator.next()) !== ITERATION_END) {数组推(值);}返回数组;}功能测试(fn,迭代){for (让 i = 0; i < 迭代; i++) {const start = performance.now();fn();console.log(performance.now() - 开始);}}函数创建数据(){return Array.from({length: 9000000 }, (_, i) => i + 1);};函数 even(x) { 返回 x % 2 === 0;}function double(x) { return x * 2;}功能测试迭代器(数据){返回函数 main() {return toArray(map(filter(arrayIterator(data), even), double));};}测试(测试迭代器(创建数据()),10);

观察如何在热路径上不再有动态创建的函数,并且公共接口"没有了.(即arrayIteratormapfiltertoArray的组合方式)和之前完全一样,只是引擎盖下的细节已经改变.为所有函数命名的好处是您可以获得更有用的分析输出 ;-)

精明的读者会注意到,这种修改只会将问题移开:如果您的代码中有几个地方使用不同的修饰符/谓词调用 mapfilter,那么内联问题将再次出现.正如我上面所说:微基准测试往往具有误导性,因为真实的应用程序通常具有不同的行为......

(FWIW,这与 为什么这个函数调用的执行时间会发生变化? .)

I have a custom JS iterator implementation and code for measuring performance of the latter implementation:

const ITERATION_END = Symbol('ITERATION_END');

const arrayIterator = (array) => {
  let index = 0;

  return {
    hasValue: true,
    next() {
      if (index >= array.length) {
        this.hasValue = false;

        return ITERATION_END;
      }

      return array[index++];
    },
  };
};

const customIterator = (valueGetter) => {
  return {
    hasValue: true,
    next() {
      const nextValue = valueGetter();

      if (nextValue === ITERATION_END) {
        this.hasValue = false;

        return ITERATION_END;
      }

      return nextValue;
    },
  };
};

const map = (iterator, selector) => customIterator(() => {
  const value = iterator.next();

  return value === ITERATION_END ? value : selector(value);
});

const filter = (iterator, predicate) => customIterator(() => {
  if (!iterator.hasValue) {
    return ITERATION_END;
  }

  let currentValue = iterator.next();

  while (iterator.hasValue && currentValue !== ITERATION_END && !predicate(currentValue)) {
    currentValue = iterator.next();
  }

  return currentValue;
});

const toArray = (iterator) => {
  const array = [];

  while (iterator.hasValue) {
    const value = iterator.next();

    if (value !== ITERATION_END) {
      array.push(value);
    }
  }

  return array;
};

const test = (fn, iterations) => {
  const times = [];

  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    times.push(performance.now() - start);
  }

  console.log(times);
  console.log(times.reduce((sum, x) => sum + x, 0) / times.length);
}

const createData = () => Array.from({ length: 9000000 }, (_, i) => i + 1);

const testIterator = (data) => () => toArray(map(filter(arrayIterator(data), x => x % 2 === 0), x => x * 2))

test(testIterator(createData()), 10);

The output of the test function is very weird and unexpected - the first test run is constantly executed two times faster than all the other runs. One of the results, where the array contains all execution times and the number is the mean (I ran it on Node):

[
  147.9088459983468,
  396.3472499996424,
  374.82447600364685,
  367.74555300176144,
  363.6300039961934,
  362.44370299577713,
  363.8418449983001,
  390.86111199855804,
  360.23125199973583,
  358.4788999930024
]
348.6312940984964

Similar results can be observed using Deno runtime, however I could not reproduce this behaviour on other JS engines. What can be the reason behind it on the V8?

Environment: Node v13.8.0, V8 v7.9.317.25-node.28, Deno v1.3.3, V8 v8.6.334

解决方案

(V8 developer here.) In short: it's inlining, or lack thereof, as decided by engine heuristics.

For an optimizing compiler, inlining a called function can have significant benefits (e.g.: avoids the call overhead, sometimes makes constant folding possible, or elimination of duplicate computations, sometimes even creates new opportunities for additional inlining), but comes at a cost: it makes the compilation itself slower, and it increases the risk of having to throw away the optimized code ("deoptimize") later due to some assumption that turns out not to hold. Inlining nothing would waste performance, inlining everything would waste performance, inlining exactly the right functions would require being able to predict the future behavior of the program, which is obviously impossible. So compilers use heuristics.

V8's optimizing compiler currently has a heuristic to inline functions only if it was always the same function that was called at a particular place. In this case, that's the case for the first iterations. Subsequent iterations then create new closures as callbacks, which from V8's point of view are new functions, so they don't get inlined. (V8 actually knows some advanced tricks that allow it to de-duplicate function instances coming from the same source in some cases and inline them anyway; but in this case those are not applicable [I'm not sure why]).

So in the first iteration, everything (including x => x % 2 === 0 and x => x * 2) gets inlined into toArray. From the second iteration onwards, that's no longer the case, and instead the generated code performs actual function calls.

That's probably fine; I would guess that in most real applications, the difference is barely measurable. (Reduced test cases tend to make such differences stand out more; but changing the design of a larger app based on observations made on a small test is often not the most impactful way to spend your time, and at worst can make things worse.)

Also, hand-optimizing code for engines/compilers is a difficult balance. I would generally recommend not to do that (because engines improve over time, and it really is their job to make your code fast); on the other hand, there clearly is more efficient code and less efficient code, and for maximum overall efficiency, everyone involved needs to do their part, i.e. you might as well make the engine's job simpler when you can.

If you do want to fine-tune performance of this, you can do so by separating code and data, thereby making sure that always the same functions get called. For example like this modified version of your code:

const ITERATION_END = Symbol('ITERATION_END');

class ArrayIterator {
  constructor(array) {
    this.array = array;
    this.index = 0;
  }
  next() {
    if (this.index >= this.array.length) return ITERATION_END;
    return this.array[this.index++];
  }
}
function arrayIterator(array) {
  return new ArrayIterator(array);
}

class MapIterator {
  constructor(source, modifier) {
    this.source = source;
    this.modifier = modifier;
  }
  next() {
    const value = this.source.next();
    return value === ITERATION_END ? value : this.modifier(value);
  }
}
function map(iterator, selector) {
  return new MapIterator(iterator, selector);
}

class FilterIterator {
  constructor(source, predicate) {
    this.source = source;
    this.predicate = predicate;
  }
  next() {
    let value = this.source.next();
    while (value !== ITERATION_END && !this.predicate(value)) {
      value = this.source.next();
    }
    return value;
  }
}
function filter(iterator, predicate) {
  return new FilterIterator(iterator, predicate);
}

function toArray(iterator) {
  const array = [];
  let value;
  while ((value = iterator.next()) !== ITERATION_END) {
    array.push(value);
  }
  return array;
}

function test(fn, iterations) {
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    console.log(performance.now() - start);
  }
}

function createData() {
  return Array.from({ length: 9000000 }, (_, i) => i + 1);
};

function even(x) { return x % 2 === 0; }
function double(x) { return x * 2; }
function testIterator(data) {
  return function main() {
    return toArray(map(filter(arrayIterator(data), even), double));
  };
}

test(testIterator(createData()), 10);

Observe how there are no more dynamically created functions on the hot path, and the "public interface" (i.e. the way arrayIterator, map, filter, and toArray compose) is exactly the same as before, only under-the-hood details have changed. A benefit of giving all functions names is that you get more useful profiling output ;-)

Astute readers will notice that this modification only shifts the issue away: if you have several places in your code that call map and filter with different modifiers/predicates, then the inlineability issue will come up again. As I said above: microbenchmarks tend to be misleading, as real apps typically have different behavior...

(FWIW, this is pretty much the same effect as at Why is the execution time of this function call changing? .)

这篇关于为什么第一个函数调用的执行速度比所有其他顺序调用快两倍?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆