Scala/Python vs. Java: SparkContext.map vs. .filter in PI example?


Question

In the Pi example located at http://spark.apache.org/examples.html:

In the Estimating Pi example, there is a discrepancy between the Python/Scala and Java versions that I don't understand. Python and Scala both use map and reduce:

Python

def sample(p):
    x, y = random(), random()
    return 1 if x*x + y*y < 1 else 0

count = spark.parallelize(xrange(0, NUM_SAMPLES)).map(sample) \
             .reduce(lambda a, b: a + b)
print "Pi is roughly %f" % (4.0 * count / NUM_SAMPLES)

Scala

val count = spark.parallelize(1 to NUM_SAMPLES).map{i =>
  val x = Math.random()
  val y = Math.random()
  if (x*x + y*y < 1) 1 else 0
}.reduce(_ + _)
println("Pi is roughly " + 4.0 * count / NUM_SAMPLES)

But Java uses filter:

int count = spark.parallelize(makeRange(1, NUM_SAMPLES)).filter(
  new Function<Integer, Boolean>() {
    public Boolean call(Integer i) {
      double x = Math.random();
      double y = Math.random();
      return x*x + y*y < 1;
    }
  }).count();
System.out.println("Pi is roughly " + 4 * count / NUM_SAMPLES);

Is this just a doc typo/bug? Is filter preferable in Java, and map/reduce preferred in Scala and Python, for some reason?

Recommended answer

These approaches are equivalent. The Java code simply counts the cases where the Scala/Python map returns 1. To make this a little more transparent:

def inside(x, y):
    """Check if point (x, y) is inside a unit circle
    with center in the origin (0, 0)"""
    return x*x + y*y < 1

points = ... 

# Scala / Python code is equivalent to this
sum([1 if inside(x, y) else 0 for (x, y) in points])

# While Java code is equivalent to this
len([(x, y) for (x, y) in points if inside(x, y)])

Either way, the count you get, divided by the number of samples, estimates the fraction of the enclosing square's area that is covered by the circle. From the area formula that fraction is π/4, which is why each example multiplies by 4 to get an estimate of π.
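To check the equivalence concretely without a Spark cluster, here is a minimal plain-Python sketch. It is not the Spark API: the function `estimate_pi` and its `seed` parameter are hypothetical names introduced for illustration, and ordinary lists stand in for the RDD. Both counting styles are applied to the same sampled points so the results can be compared directly.

```python
import random


def inside(x, y):
    """Return True if the point (x, y) lies inside the unit circle centred at the origin."""
    return x * x + y * y < 1


def estimate_pi(num_samples, seed=0):
    """Hypothetical helper: count hits in both styles on the same random points."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    points = [(rng.random(), rng.random()) for _ in range(num_samples)]

    # map + reduce style (Scala / Python examples): map each point to 1 or 0, then sum
    count_map_reduce = sum(1 if inside(x, y) else 0 for x, y in points)

    # filter + count style (Java example): keep only the hits, then count them
    count_filter = len([p for p in points if inside(*p)])

    # 4 * (fraction of hits) estimates pi
    return count_map_reduce, count_filter, 4.0 * count_map_reduce / num_samples


if __name__ == "__main__":
    by_map_reduce, by_filter, pi_estimate = estimate_pi(100000)
    print("Counts agree:", by_map_reduce == by_filter)
    print("Pi is roughly %f" % pi_estimate)
```

The two counts always agree because summing a 0/1 indicator over all points is the same as counting the points for which the indicator is 1; which form to use is a matter of style rather than language.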
