用于生成素数的并行算法(可能使用Hadoop的map reduce) [英] Parallel Algorithms for Generating Prime Numbers (possibly using Hadoop's map reduce)

查看:233
本文介绍了用于生成素数的并行算法(可能使用Hadoop的map reduce)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

生成素数是一个玩具问题,我经常尝试,尤其是在尝试新的编程语言,平台或风格时。



我在想尝试使用Hadoop(Map Reduce)编写素数生成算法或素数测试算法。



我以为我会发布此问题以获取提示,参考,到算法,方法。

虽然我的主要兴趣是基于Map Reduce的算法,但我并不介意查看新的Hadoop编程模型,或者例如使用



其中每个素数都表示该素数的奇数倍的流,例如对于 7 49,49 + 14,49 + 28, ...,它们全部合并得到所有合成数字,然后 primes em>在复合数字之间的间隙中找到。它在Haskell中,因此时间由惰性评估机制隐式地处理(以及算法调整,其中每个比较节点总是从左侧通过第一个号码而不要求数字从右侧开始,因为无论如何它肯定会变得更大)

赔率可以用来代替奇素数作为种子对于多次生成,可以简化一些事情(具有明显的性能影响)。



这项工作自然可以分成连续的素数平方之间的片段。 Haskell代码如下,但我们也可以把它看作一个可执行的伪代码(其中是一个列表节点的惰性构造函数,一个函数调用 f (x)写成 fx ,括号仅用于分组)

  primes = 2:g [] 
其中
g ps = 3:minus [5,7 ..]( _U((x:xs):t)= x:union xs(_U(pairs t(p) ))
pair((x:xs):ys:t)=(x:union xs ys):pairs t

union(x:xs)(y:ys)= case比较
LT的xy - > x:联合xs(y:ys)
EQ - > x:union xs ys
GT - > y:联合(x:xs)ys

减号(x:xs)(y:ys)=
LT - >的情况比较x y x:减xs(y:ys)
EQ - >减去xs ys
GT - >减号(x:xs)ys

讨论是这里。更复杂,更懒惰的日程安排是此处。另外这个SO答案显示(相关的)Haskell代码在发生器方面的大致翻译; 这个在Python中。


Generating Prime numbers is a toy problem that I often attempt from time to time, especially when experimenting with a new programming language, platform or style.

I was thinking of attempting to write a Prime Number Generation algorithm or a Prime Number Test Algorithm using Hadoop (Map Reduce).

I thought I'd post this question to get tips, references, to algorithms, approaches.

Although my primary interest is a Map Reduce based algorithm I wouldn't mind looking at new Hadoop programming models or for example looking at using PiCloud

I have seems some interesting questions here on Prime Number Generation: here, here and here, but nothing related to a Parallel approach caught my eye.

Thanks in advance.

解决方案

Here's an algorithm that is built on mapping and reducing (folding). It expresses the sieve of Eratosthenes

     P = {3,5,7, ...} \ U {{p2, p2+2p, p2+4p, ...} | p in P}

for the odd primes (i.e without the 2). The folding tree is indefinitely deepening to the right, like this:

where each prime number marks a stream of odd multiples of that prime, e.g. for 7: 49, 49+14, 49+28, ... , which are all merged to get all the composite numbers, and then primes are found in the gaps between the composite numbers. It is in Haskell, so the timing is taken care of implicitly by the lazy evaluation mechanism (and the algorithmic adjustment where each comparing node always lets through the first number from the left without demanding a number from the right, because it is guaranteed to be bigger anyway).

Odds can be used instead of odd primes as seeds for multiples generation, to simplify things (with obvious performance implications).

The work can naturally be divided into segments between consecutive primes' squares. Haskell code follows, but we can regard it as an executable pseudocode too (where : is a list node lazy constructor, a function call f(x) is written f x, and parentheses are used for grouping only):

primes = 2 : g []
  where
    g ps = 3 : minus [5,7..] (_U [[p*p, p*p+2*p..] | p <- g ps])
    _U ((x:xs):t) = x : union xs (_U (pairs t))
    pairs ((x:xs):ys:t) = (x : union xs ys) : pairs t

union (x:xs) (y:ys) = case compare x y of 
    LT -> x : union  xs (y:ys) 
    EQ -> x : union  xs    ys 
    GT -> y : union (x:xs) ys

minus (x:xs) (y:ys) = case compare x y of
    LT -> x : minus  xs (y:ys) 
    EQ ->     minus  xs    ys 
    GT ->     minus (x:xs) ys

A discussion is here. More sophisticated, lazier scheduling is here. Also this SO answer shows approximate translation of (related) Haskell code in terms of generators; this one in Python.

这篇关于用于生成素数的并行算法(可能使用Hadoop的map reduce)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆