Scala - Iterate Over Two Arrays


Question

How do you iterate over two arrays of the same size, accessing the same index on each iteration, The Scala Way™?

      for ((aListItem, bListItem) <- (aList, bList)) {
         // do something with items
      }

The Java way applied to Scala:

      for (i <- 0 until aList.length) {
         aList(i)
         bList(i)
      }

Assume both lists are the same size.

Recommended answer

tl;dr: There are trade-offs between speed and convenience; you need to know your use case to pick appropriately.

If you know both arrays are the same length and you don't need to worry about how fast it is, the easiest and most canonical approach is to use zip inside a for-comprehension:

for ((a,b) <- aList zip bList) { ??? }
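
As a concrete illustration, here is a minimal sketch of the zip form computing a dot product. The names ZipDotProduct and dot are made up for this example and do not come from the original answer; it assumes both arrays have the same length.

```scala
object ZipDotProduct {
  // Dot product via zip inside a for-comprehension.
  // zip first builds an intermediate Array[(Int, Int)], then the loop walks it.
  def dot(aList: Array[Int], bList: Array[Int]): Int = {
    var sum = 0
    for ((a, b) <- aList zip bList) sum += a * b
    sum
  }
}
```

Calling ZipDotProduct.dot(Array(1, 2, 3), Array(4, 5, 6)) adds up 1*4 + 2*5 + 3*6.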

The zip method creates a new array of pairs, however. To avoid that overhead, you can use zipped on a tuple, which presents the elements in pairs to methods like foreach and map:

(aList, bList).zipped.foreach{ (a,b) => ??? }
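
One caveat worth knowing: on Scala 2.13 and later, zipped on tuples is deprecated and lazyZip is the replacement for the same idea. The sketch below assumes Scala 2.13 or newer; the names LazyZipDotProduct and dot are hypothetical.

```scala
object LazyZipDotProduct {
  // Assumes Scala 2.13+: lazyZip pairs elements up lazily,
  // so no intermediate array of tuples is built.
  def dot(aList: Array[Int], bList: Array[Int]): Int = {
    var sum = 0
    aList.lazyZip(bList).foreach { (a, b) => sum += a * b }
    sum
  }
}
```

On older Scala versions the (aList, bList).zipped form shown above plays the same role.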

Faster still is to index into the arrays, especially if they contain primitives like Int, since the generic code above has to box them. There is a handy method, indices, that you can use:

for (i <- aList.indices) { ??? }
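
Filled in for the dot product case, the indexed form might look like the sketch below. IndexedDotProduct and dot are illustrative names only, and the code assumes bList is at least as long as aList.

```scala
object IndexedDotProduct {
  // Iterate over the index range of aList and read both arrays at index i.
  // Assumes bList.length >= aList.length; otherwise bList(i) throws.
  def dot(aList: Array[Int], bList: Array[Int]): Int = {
    var sum = 0
    for (i <- aList.indices) sum += aList(i) * bList(i)
    sum
  }
}
```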

Finally, if you need to go as fast as you possibly can, you can fall back to manual while loops or recursion, like so:

// While loop
var i = 0
while (i < aList.length) {
  ???
  i += 1
}

// Recursion
def loop(i: Int): Unit = {
  if (i < aList.length) {
    ???
    loop(i+1)
  }
}
loop(0)
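
To make the while-loop skeleton concrete, here is a sketch with the ??? replaced by a dot product step. The names WhileDotProduct and dot are invented for this example, and equal array lengths are assumed.

```scala
object WhileDotProduct {
  // Manual while loop: no closures, no boxing, no intermediate collections.
  // Assumes both arrays have the same length.
  def dot(aList: Array[Int], bList: Array[Int]): Int = {
    var sum = 0
    var i = 0
    while (i < aList.length) {
      sum += aList(i) * bList(i)
      i += 1
    }
    sum
  }
}
```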

If you are computing some value, rather than relying on a side effect, recursion is sometimes faster if you pass the value along as an accumulator:

// Recursion with explicit result
def loop(i: Int, acc: Int = 0): Int =
  if (i < aList.length) {
    val nextAcc = ???
    loop(i+1, nextAcc)
  }
  else acc

Since you can drop a method definition in anywhere, you can use recursion without restriction. You can add an @annotation.tailrec annotation to make sure it is compiled down to a fast loop with jumps instead of actual recursion that eats stack space.
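
Putting the accumulator and the annotation together, a sketch of the recursive dot product could look like this. TailrecDotProduct and dot are hypothetical names; the nested loop mirrors the one above, with the accumulator made an explicit argument.

```scala
import scala.annotation.tailrec

object TailrecDotProduct {
  def dot(aList: Array[Int], bList: Array[Int]): Int = {
    // @tailrec makes the compiler reject this method
    // if the recursive call is not in tail position.
    @tailrec
    def loop(i: Int, acc: Int): Int =
      if (i < aList.length) loop(i + 1, acc + aList(i) * bList(i))
      else acc
    loop(0, 0)
  }
}
```

Because the call is in tail position, long arrays do not grow the stack.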

Taking all these different approaches to calculate a dot product on length-1024 vectors, we can compare them to a reference implementation in Java:

public class DotProd {
  public static int dot(int[] a, int[] b) {
    int s = 0;
    for (int i = 0; i < a.length; i++) s += a[i]*b[i];
    return s;
  }
}

plus an equivalent version where we take the dot product of the lengths of strings (so we can assess objects vs. primitives).
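
The object version of the benchmark is not shown in the answer. Purely as an assumption about what it might look like, here is a Scala sketch of a dot product over string lengths; StringLengthDotProduct and dot are names invented here.

```scala
object StringLengthDotProduct {
  // Hypothetical object-based variant: the arrays hold Strings (references),
  // and the values multiplied are the string lengths.
  def dot(a: Array[String], b: Array[String]): Int = {
    var s = 0
    var i = 0
    while (i < a.length) {
      s += a(i).length * b(i).length
      i += 1
    }
    s
  }
}
```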

normalized time (Java reference = 100%; lower is faster)
-----------------
primitive  object  method
---------  ------  ---------------------------------
 100%       100%   Java indexed for loop (reference)
 100%       100%   Scala while loop
 100%       100%   Scala recursion (either way)
 185%       135%   Scala for comprehension on indices
2100%       130%   Scala zipped
3700%       800%   Scala zip

The slowdown is particularly bad, of course, with primitives! (You get similarly huge jumps in time taken if you use an ArrayList of Integer instead of an array of int in Java.) Note in particular that zipped is quite a reasonable choice if you have objects stored.

Do beware of premature optimization, though! Functional forms like zip have advantages in clarity and safety. If you always write while loops because you think "every little bit helps", you're probably making a mistake: they take more time to write and debug, and you could be using that time to optimize some more important part of your program.

But assuming your arrays are the same length is dangerous. Are you sure? How much effort will you make to be sure? Maybe you shouldn't make that assumption?
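
One cheap way to stop assuming is to check the lengths up front. The sketch below is only one possible policy (fail fast); the names CheckedDotProduct and dot are hypothetical.

```scala
object CheckedDotProduct {
  // Fail fast with IllegalArgumentException when the lengths differ,
  // instead of silently truncating or hitting an out-of-bounds index later.
  def dot(aList: Array[Int], bList: Array[Int]): Int = {
    require(aList.length == bList.length,
      s"length mismatch: ${aList.length} vs ${bList.length}")
    var sum = 0
    var i = 0
    while (i < aList.length) {
      sum += aList(i) * bList(i)
      i += 1
    }
    sum
  }
}
```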

If you don't need it to be fast, just correct, then you have to choose what to do if the two arrays are not the same length.

If you want to do something with all the elements up to the length of the shorter array, then zip is still what you use:

// The second is just shorthand for the first
(aList zip bList).foreach{ case (a,b) => ??? }
for ((a,b) <- (aList zip bList)) { ??? }

// This avoids an intermediate array
(aList, bList).zipped.foreach{ (a,b) => ??? }
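
To see the truncation in action, here is a small sketch with arrays of different lengths. ZipTruncation and pairs are made-up names; the result is converted to a List only to make it easy to inspect.

```scala
object ZipTruncation {
  // zip stops at the end of the shorter array;
  // extra elements of the longer one are dropped.
  def pairs(aList: Array[Int], bList: Array[String]): List[(Int, String)] =
    (aList zip bList).toList
}
```

With Array(1, 2, 3) and Array("a", "b"), the element 3 has no partner and is simply left out.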

If you instead want to pad the shorter one with a default value, you would use zipAll:

aList.zipAll(bList, aDefault, bDefault).foreach{ case (a,b) => ??? }
for ((a,b) <- aList.zipAll(bList, aDefault, bDefault)) { ??? }
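
As a worked case, the sketch below pads with caller-supplied defaults and sums the products. ZipAllDotProduct and paddedDot are hypothetical names; aDefault and bDefault are the same placeholders used in the snippet above.

```scala
object ZipAllDotProduct {
  // zipAll runs to the length of the longer array, substituting
  // aDefault or bDefault once the shorter array is exhausted.
  def paddedDot(aList: Array[Int], bList: Array[Int],
                aDefault: Int, bDefault: Int): Int = {
    var sum = 0
    aList.zipAll(bList, aDefault, bDefault).foreach { case (a, b) => sum += a * b }
    sum
  }
}
```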

In any of these cases, you can use yield with for, or map instead of foreach, to generate a collection.
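
For instance, both forms below produce a new array of element-wise products. This is a sketch; ZipYield, products and productsViaMap are names chosen for illustration.

```scala
object ZipYield {
  // for ... yield builds a new collection from the zipped pairs.
  def products(aList: Array[Int], bList: Array[Int]): Array[Int] =
    for ((a, b) <- aList zip bList) yield a * b

  // The same thing written with map instead of a for-comprehension.
  def productsViaMap(aList: Array[Int], bList: Array[Int]): Array[Int] =
    (aList zip bList).map { case (a, b) => a * b }
}
```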

If you need the index for a calculation, or it really is an array and you really need it to be fast, you will have to do the calculation manually. Padding missing elements is awkward (I leave that as an exercise to the reader), but the basic form would be:

for (i <- 0 until math.min(aList.length, bList.length)) { ??? }

where you then use i to index into aList and bList.
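
For the padding exercise, one possible answer (a sketch, not the only way) is to loop up to the longer length and substitute a default whenever an index runs past the end of an array. PaddedIndexedDot, paddedDot, aDefault and bDefault are all names assumed for this example.

```scala
object PaddedIndexedDot {
  // Indexed loop over the longer length; out-of-range reads
  // fall back to the supplied default for that array.
  def paddedDot(aList: Array[Int], bList: Array[Int],
                aDefault: Int, bDefault: Int): Int = {
    val n = math.max(aList.length, bList.length)
    var sum = 0
    var i = 0
    while (i < n) {
      val a = if (i < aList.length) aList(i) else aDefault
      val b = if (i < bList.length) bList(i) else bDefault
      sum += a * b
      i += 1
    }
    sum
  }
}
```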

If you really need maximum speed, you would again use (tail) recursion or while loops:

val n = math.min(aList.length, bList.length)
var i = 0
while (i < n) {
  ???
  i += 1
}

def loop(i: Int): Unit = {
  if (i < aList.length && i < bList.length) {
    ???
    loop(i+1)
  }
}
loop(0)
