Is using several '.filter' calls on a big array bad for performance in JavaScript?


Problem description


I wrote this piece of code to filter an array of words. I wrote a filter function for every type of word I want to filter out and apply them sequentially to the array:

  const wordArray = rawArray.filter(removeNonDomainWords)
                            .filter(removeWordsWithDigits)
                            .filter(removeWordsWithInsideNonWordChars)
                            .filter(removeEmptyWords)
                            .filter(removeSearchTerm, term)
                            .map(word => replaceNonWordCharsFromStartAndEnd(word))

This code iterates over the whole array six times if I am not mistaken.

Wouldn't it be more efficient to write one (more complex, yet still easy in my scenario) filter function that logically combines the filter functions to achieve the same result?

I learned about filter in the context of Functional Programming which is supposed to make my code shorter and faster. That's why I probably didn't question what I was writing, thinking 'I am doing FP, this gotta be good'.

Thanks!

Solution

Well, it does iterate six times, but not necessarily over the whole initial array. Each filter call produces a smaller array for the next call to work on. A single filter method would be more efficient, but the difference might not be as great as you expect.

If you still want to use this solution, you can improve performance by applying the most selective filter first (that is, the one expected to filter out the most elements). That way, the subsequent arrays will be smaller and there will be less to iterate through.
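The effect of filter ordering can be made visible by counting predicate calls. The sketch below is only an illustration: the data, the `counted` wrapper, and both predicates are hypothetical and do not come from the question.

```javascript
// Hypothetical data: 1000 strings, 100 of which start with "word".
const words = Array.from({ length: 1000 }, (_, i) =>
  i % 10 === 0 ? `word${i}` : `w${i}`
);

// Wraps a predicate so that every invocation is counted.
let calls = 0;
const counted = (predicate) => (value) => {
  calls += 1;
  return predicate(value);
};

const startsWithWord = (w) => w.startsWith('word'); // keeps 10% (selective)
const isNotEmpty = (w) => w.length > 0;             // keeps 100% (not selective)

calls = 0;
const selectiveFirst = words
  .filter(counted(startsWithWord))
  .filter(counted(isNotEmpty));
const callsSelectiveFirst = calls; // 1000 + 100 = 1100

calls = 0;
const selectiveLast = words
  .filter(counted(isNotEmpty))
  .filter(counted(startsWithWord));
const callsSelectiveLast = calls; // 1000 + 1000 = 2000

console.log(callsSelectiveFirst, callsSelectiveLast);
```

Both orderings produce the same 100 words; only the amount of work differs.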

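As @Redu points out (in the comments), you can chain your predicates inside a single filter callback using a logical operator (|| if the predicates flag words to remove, && if they flag words to keep). This makes sure you only do one iteration.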
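A minimal sketch of the single-pass idea follows. It assumes each predicate returns true for words that should be kept, so the predicates are combined with AND semantics (via `every`, which short-circuits like `&&`). The predicate bodies and the `allOf` helper are hypothetical stand-ins, since the question does not show its own implementations. The question passes `term` as the second argument to `filter` (the `thisArg`); this sketch uses a closure instead.

```javascript
// Hypothetical stand-ins for the question's predicates.
// Each returns true when the word should be KEPT.
const hasNoDigits = (word) => !/\d/.test(word);
const isNotEmpty = (word) => word.length > 0;
const isNotSearchTerm = (term) => (word) => word !== term;

// Combines predicates into one; `every` stops at the first failing predicate.
const allOf = (...predicates) => (value) =>
  predicates.every((predicate) => predicate(value));

const rawArray = ['apple', 'b4nana', '', 'cherry', 'apple'];
const keep = allOf(hasNoDigits, isNotEmpty, isNotSearchTerm('apple'));

// One pass over rawArray and no intermediate arrays between predicates.
const wordArray = rawArray.filter(keep);
console.log(wordArray); // [ 'cherry' ]
```

Putting the cheapest or most selective predicate first in `allOf` lets the short-circuit skip the remaining checks for most elements.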


The reason behind this is that Array.prototype.filter returns a new array. Compare this with the Java Stream API, which returns a stream and can therefore go "depth first" through the call list. The downside is that you need a terminal operation at the end to "collect" your result.

In JavaScript,

  rawArray.filter(x)

iterates over rawArray and returns a new filtered array, which can in turn be filtered or used as it is. It results in one call to x for each element of rawArray.
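This eager behaviour can be checked directly. In the sketch below the array contents and the predicate `x` are made up for illustration; the counter only exists to show how many times `x` runs.

```javascript
const rawArray = [1, 2, 3, 4, 5, 6];

// Hypothetical predicate that records how often it is called.
let callsToX = 0;
const x = (n) => {
  callsToX += 1;
  return n % 2 === 0;
};

const filtered = rawArray.filter(x);

console.log(filtered);              // [ 2, 4, 6 ]
console.log(callsToX);              // 6, one call per element
console.log(filtered === rawArray); // false, a new array was created
console.log(rawArray.length);       // 6, the original is unchanged
```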

In Java, the equivalent would be

  rawArray.stream().filter(x)

which would not actually do anything at this point. No calls to x would be made. The return value is a Stream that can be used later. It can be filtered further, but no calls are made until the values are collected in some way with a terminal operation.

Let's compare the JavaScript

  rawArray.filter(x).filter(y).length

to the Java

  rawArray.stream().filter(x).filter(y).count()

In JavaScript, this would first iterate over all of the elements of rawArray, calling x for each of them, and store the result in an intermediate array. The JavaScript engine would then iterate over all of the elements of that intermediate array, calling y for each one, and store the result in a second intermediate array, whose length it would finally read.

In Java, the snippet would result in the VM iterating over the elements of rawArray once: for each element it first calls x and, if x returns true, then calls y and, if y also returns true, increments the counter. There would be no intermediate arrays, and only one iteration over the dataset.
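The stream-like, "depth first" behaviour described for Java can be approximated in JavaScript with generator functions. This is a sketch under assumptions, not what `Array.prototype.filter` does: `lazyFilter` and `count` are hypothetical helpers, and the log array is only there to expose the order of calls.

```javascript
// Lazily yields the items that satisfy the predicate; nothing runs
// until a consumer starts pulling values from the generator.
function* lazyFilter(iterable, predicate) {
  for (const item of iterable) {
    if (predicate(item)) yield item;
  }
}

// Plays the role of a terminal operation: consumes the pipeline and counts.
function count(iterable) {
  let n = 0;
  for (const _ of iterable) n += 1;
  return n;
}

const log = [];
const x = (n) => { log.push(`x(${n})`); return n % 2 === 0; };
const y = (n) => { log.push(`y(${n})`); return n > 2; };

const rawArray = [1, 2, 3, 4];
const pipeline = lazyFilter(lazyFilter(rawArray, x), y);
const callsBeforeTerminal = log.length; // 0, building the pipeline calls nothing

const total = count(pipeline);
console.log(total);          // 1
console.log(log.join(' ')); // x(1) x(2) y(2) x(3) x(4) y(4)
```

The log shows each element passing through x and then y before the next element is touched, with a single pass over rawArray and no intermediate arrays.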

Functional programming is interesting, and when used properly, it creates less code that is less complex and ideally perhaps even a bit easier to read, but it does hand over a lot of responsibility to the framework (or engine or VM or whatever), and it is important to realize that seemingly similar code, while behaving similarly, can perform vastly differently in different environments.
