Scala: Get sum of nth element from tuple array/RDD


Question

I have an array of tuples:

val a = Array((1,2,3), (2,3,4))

I want to write a generic version of a method like the one below:

def sum2nd(aa: Array[(Int, Int, Int)]) = {
  aa.map { a => a._2 }.sum
}

So what I am looking for is a method like:

def sumNth(aa: Array[(Int, Int, Int)], n: Int)


Recommended answer

There are a few ways you can go about this. The simplest is to use productElement:

def unsafeSumNth[P <: Product](xs: Seq[P], n: Int): Int =
  xs.map(_.productElement(n).asInstanceOf[Int]).sum

And then (note that indexing starts at zero, so n = 1 gives us the second element):

scala> val a = Array((1, 2, 3), (2, 3, 4))
a: Array[(Int, Int, Int)] = Array((1,2,3), (2,3,4))

scala> unsafeSumNth(a, 1)
res0: Int = 5
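
The zero-based indexing comes from the Product trait, which every tuple extends. As a minimal sketch of the two members relevant here (the object name ProductBasics is only for illustration and is not part of the answer):

```scala
object ProductBasics {
  val t: (Int, Int, Int) = (1, 2, 3)

  // productArity is the number of elements in the tuple.
  val arity: Int = t.productArity

  // productElement is zero-based and returns Any,
  // which is why unsafeSumNth needs the asInstanceOf[Int] cast.
  val second: Any = t.productElement(1)
}
```

Because the result type is Any, the compiler cannot check either the index or the element type, so mistakes only show up at runtime.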

It can fail at runtime in two different ways:

scala> unsafeSumNth(List((1, 2), (2, 3)), 3)
java.lang.IndexOutOfBoundsException: 3
  at ...

scala> unsafeSumNth(List((1, "a"), (2, "b")), 1)
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
  at ...



That is, it fails if the tuple doesn't have enough elements, or if the element you asked for isn't an Int.

You can write a version that doesn't crash at runtime:

import scala.util.Try

def saferSumNth[P <: Product](xs: Seq[P], n: Int): Try[Int] = Try(
  xs.map(_.productElement(n).asInstanceOf[Int]).sum
)

And then:

scala> saferSumNth(a, 1)
res4: scala.util.Try[Int] = Success(5)

scala> saferSumNth(List((1, 2), (2, 3)), 3)
res5: scala.util.Try[Int] = Failure(java.lang.IndexOutOfBoundsException: 3)

scala> saferSumNth(List((1, "a"), (2, "b")), 1)
res6: scala.util.Try[Int] = Failure(java.lang.ClassCastException: ...

This is an improvement, since it forces callers to address the possibility of failure, but it's also kind of annoying, since it forces callers to address the possibility of failure.
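
To make "addressing the possibility of failure" concrete, here is a rough sketch of two ways a caller might consume the Try result. The helper names sumOrZero and describe, and the wrapping object TryHandling, are hypothetical and only serve the illustration; saferSumNth is repeated so the snippet stands on its own.

```scala
import scala.util.{Failure, Success, Try}

object TryHandling {
  // Same definition as saferSumNth above, repeated for self-containment.
  def saferSumNth[P <: Product](xs: Seq[P], n: Int): Try[Int] = Try(
    xs.map(_.productElement(n).asInstanceOf[Int]).sum
  )

  // Option 1 (hypothetical helper): fall back to a default on any failure.
  def sumOrZero[P <: Product](xs: Seq[P], n: Int): Int =
    saferSumNth(xs, n).getOrElse(0)

  // Option 2 (hypothetical helper): pattern match to tell the cases apart.
  def describe[P <: Product](xs: Seq[P], n: Int): String =
    saferSumNth(xs, n) match {
      case Success(total) => s"sum = $total"
      case Failure(e)     => s"failed: ${e.getClass.getSimpleName}"
    }
}
```

Falling back to a default is convenient but hides the error, while matching keeps the failure visible; which one is appropriate depends on the caller.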

If you're willing to use Shapeless, you can have the best of both worlds:

import shapeless._, shapeless.ops.tuple.At

def sumNth[P <: Product](xs: Seq[P], n: Nat)(implicit
  atN: At.Aux[P, n.N, Int]
): Int = xs.map(p => atN(p)).sum

And then:

scala> sumNth(a, 1)
res7: Int = 5

But the bad calls don't even compile:

scala> sumNth(List((1, 2), (2, 3)), 3)
<console>:17: error: could not find implicit value for parameter atN: ...

This still isn't perfect, though, since it means the second argument has to be a literal number (since it needs to be known at compile time):

scala> val x = 1
x: Int = 1

scala> sumNth(a, x)
<console>:19: error: Expression x does not evaluate to a non-negative Int literal
       sumNth(a, x)
                 ^

In many cases that's not a problem, though.
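
When the position is not a literal, one alternative that is not part of the original answer is to pass a selector function instead of an index. The sketch below assumes a hypothetical helper sumBy and a hypothetical lookup selectorFor; the compiler checks each projection, and the choice between them can still be made at runtime.

```scala
object SumBySelector {
  type Triple = (Int, Int, Int)

  // Hypothetical helper: the caller supplies the projection, not an index.
  def sumBy[A](xs: Seq[A])(select: A => Int): Int =
    xs.map(select).sum

  val a: List[Triple] = List((1, 2, 3), (2, 3, 4))

  // The compiler verifies that _._2 exists and is an Int.
  val secondTotal: Int = sumBy(a)(_._2)

  // Hypothetical lookup: maps a runtime index to a projection that was
  // checked at compile time; out-of-range indexes yield None.
  def selectorFor(n: Int): Option[Triple => Int] = n match {
    case 0 => Some((t: Triple) => t._1)
    case 1 => Some((t: Triple) => t._2)
    case 2 => Some((t: Triple) => t._3)
    case _ => None
  }
}
```

The trade-off is that sumBy is generic over the element type rather than over tuple position, so a lookup like selectorFor has to be written per tuple shape.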

To sum up: If you're willing to take responsibility for reasonable-looking code crashing your program, use productElement. If you want a little more safety (at the cost of some inconvenience), use productElement with Try. If you want compile-time safety (but some limitations), use Shapeless.
