Scala map() on a Map[..] much slower than mapValues()

Problem description

In a Scala program I wrote, I have a scala.collection.Map that maps a String to some calculated values (in detail it's Map[String, (Double, immutable.Map[String, Double], Double)]; I know that's ugly and should, and will, be wrapped). Now, if I do this:

stats.map { case(c, (prior, pwc, denom)) => {
  println(c)
  ...
  }
}

it takes about 30 seconds to print roughly 50 values of c! The println is just a test statement; the actual calculation I need was even slower (I aborted after 1 minute of complete silence). However, if I do it like this:

stats.mapValues { case (prior, pwc, denom) => {
  println(prior)
  ...
  }
}

I don't run into these performance issues. Can anyone explain why this is happening? Am I not following some important Scala guidelines?

Thanks for your help!

I investigated the behaviour further. My guess is that the bottleneck comes from accessing the Map data structure. If I do the following, I have the same performance issues:

classes.foreach{c => {
  println(c)
  val ps = stats(c)
  }
}

Here classes is a List[String] that stores the keys of the Map externally. Without the access to stats(c), no performance loss occurs.

Recommended answer

mapValues actually returns a view on the original map, which can lead to unexpected performance issues. From this blog post:

...here is a catch: map and mapValues are different in a not-so-subtle way. mapValues, unlike map, returns a view on the original map. This view holds references to both the original map and to the transformation function (here (_ + 1)). Every time the returned map (view) is queried, the original map is first queried and the transformation function is called on the result.
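
The behaviour described in the quote can be checked with a small call counter. This is a minimal sketch, not code from the blog post: the object name MapValuesViewDemo and the helper transformCalls are made up for illustration. It calls mapValues directly, which is lazy in Scala 2.12 and also in 2.13, where it still compiles but is deprecated in favour of .view.mapValues, so expect a deprecation warning there.

```scala
object MapValuesViewDemo {
  // Counts how often the transformation function runs when one key of a
  // mapValues result is read `lookups` times.
  def transformCalls(lookups: Int): Int = {
    var calls = 0
    val original = Map("a" -> 1, "b" -> 2)
    // Building the view does not invoke the function.
    val incremented = original.mapValues { v => calls += 1; v + 1 }
    // Every lookup goes to the original map and then re-applies the function.
    (1 to lookups).foreach(_ => incremented("a"))
    calls
  }

  def main(args: Array[String]): Unit = {
    println(transformCalls(0)) // 0: nothing has been computed yet
    println(transformCalls(3)) // 3: one call per lookup, no caching
  }
}
```

With a cheap function like (_ + 1) the repeated work is hardly noticeable; with an expensive function, every read of the view pays the full cost again.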

I recommend reading the rest of that post for some more details.
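
One possible way to avoid the repeated work is to build a strict map once, using map and passing the key through unchanged. The sketch below rests on an assumption the question does not confirm: that stats was itself produced by mapValues with a costly function, so that each stats(c) re-ran the computation. The names StrictStatsDemo, expensive and callsWithStrictMap, and the sample keys and values, are hypothetical.

```scala
object StrictStatsDemo {
  // Returns how many times the (stand-in) expensive computation runs when
  // every key in `classes` is looked up `rounds` times in a strict map.
  def callsWithStrictMap(rounds: Int): Int = {
    var calls = 0
    // Hypothetical stand-in for the costly per-class calculation.
    def expensive(v: Double): Double = { calls += 1; v * 2.0 }

    val raw = Map("spam" -> 1.0, "ham" -> 2.0)
    val classes = raw.keys.toList

    // map builds a new Map immediately: `expensive` runs once per entry here.
    val stats = raw.map { case (c, v) => c -> expensive(v) }

    // Later lookups only read the stored results.
    for (_ <- 1 to rounds; c <- classes) stats(c)
    calls
  }

  def main(args: Array[String]): Unit = {
    println(callsWithStrictMap(10)) // 2: one call per entry, not per lookup
  }
}
```

The trade-off is that the strict version computes every value up front, even for keys that are never read, whereas the view computes nothing until it is queried.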