How can I benchmark performance in Spark console?

Question

I have just started using Spark, and for now my interactions with it revolve around spark-shell. I would like to benchmark how long various commands take, but I could not find out how to get the current time or run a benchmark. Ideally I want to do something super simple, such as:

val t = [current_time]
data.map(etc).distinct().reduceByKey(_ + _)
println([current time] - t)

Figured it out:

import org.joda.time._
val t_start = DateTime.now()
[[do stuff]]
val t_end = DateTime.now()
new Period(t_start, t_end).toStandardSeconds()

Recommended answer

I suggest you do the following:

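// `f` is a by-name parameter: it is evaluated inside `time`, not at the call site.
def time[A](f: => A): A = {
  val s = System.nanoTime
  val ret = f  // run the block being measured
  println("time: " + (System.nanoTime - s) / 1e9 + " seconds")
  ret          // hand the block's result back to the caller
}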

You can pass any expression or function call as the argument to time. It evaluates the argument, prints how long the evaluation took, and returns the result.

Let's consider a function foobar that takes data as its argument. You would then do the following:

val test = time(foobar(data))

test will contain the result of foobar, and the time it took is printed as well.
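
If the elapsed time is needed as a value rather than printed, a variant along the following lines may help. This is only a sketch: the names timed and averageSeconds are hypothetical helpers, not part of the original answer, and the summation is a stand-in workload so the snippet runs without a Spark context.

// Hypothetical helper: returns the result together with the elapsed
// wall-clock time in seconds instead of printing it.
def timed[A](f: => A): (A, Double) = {
  val start = System.nanoTime
  val result = f  // the by-name argument is evaluated here
  val seconds = (System.nanoTime - start) / 1e9
  (result, seconds)
}

// Hypothetical helper: evaluates the block `runs` times and returns the
// average elapsed seconds, which reduces noise from JVM warm-up.
def averageSeconds[A](runs: Int)(f: => A): Double = {
  require(runs > 0, "runs must be positive")
  val times = (1 to runs).map(_ => timed(f)._2)
  times.sum / runs
}

// Stand-in workload (an assumption for illustration, not Spark code).
val (total, secs) = timed((1 to 1000000).map(_.toLong).sum)
println("result: " + total + ", time: " + secs + " seconds")
println("average over 5 runs: " + averageSeconds(5)((1 to 1000000).map(_.toLong).sum) + " seconds")

In spark-shell, the block being timed should probably end in an action such as count() or collect(). Transformations like map and reduceByKey are lazy, so timing them alone mostly measures how long it takes to build the execution plan.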
