Merge of lazy streams (using generators) in Python


Problem description

I'm playing with functional capacities of Python 3 and I tried to implement classical algorithm for calculating Hamming numbers. That's the numbers which have as prime factors only 2, 3 or 5. First Hamming numbers are 2, 3, 4, 5, 6, 8, 10, 12, 15, 16, 18, 20 and so on.

My implementation is the following:

def scale(s, m):
    return (x*m for x in s)

def merge(s1, s2):
    it1, it2 = iter(s1), iter(s2)
    x1, x2 = next(it1), next(it2)
    if x1 < x2:
        x = x1
        it = iter(merge(it1, s2))
    elif x1 > x2:
        x = x2
        it = iter(merge(s1, it2))
    else:
        x = x1
        it = iter(merge(it1, it2))
    yield x
    while True: yield next(it)

def integers():
    n = 0
    while True:
        n += 1
        yield n

m2 = scale(integers(), 2)
m3 = scale(integers(), 3)
m5 = scale(integers(), 5)

m23 = merge(m2, m3)

hamming_numbers = merge(m23, m5)

The problem is that merge just doesn't seem to work. Before that I implemented the Sieve of Eratosthenes in the same way, and it worked perfectly okay:

def sieve(s):
    it = iter(s)
    x = next(it)
    yield x
    it = iter(sieve(filter(lambda y: x % y, it)))
    while True: yield next(it)

This one uses the same techniques as my merge operation. So I can't see any difference. Do you have any ideas?

(I know that all of these can be implemented other ways, but my goal exactly to understand generators and pure functional capabilities, including recursion, of Python, without using class declarations or special pre-built Python functions.)

UPD: For Will Ness, here's my implementation of this algorithm in LISP (Racket actually):

(define (scale str m)
  (stream-map (lambda (x) (* x m)) str))

(define (integers-from n)
  (stream-cons n
               (integers-from (+ n 1))))

(define (merge s1 s2)
  (let ((x1 (stream-first s1))
        (x2 (stream-first s2)))
    (cond ((< x1 x2)
           (stream-cons x1 (merge (stream-rest s1) s2)))
          ((> x1 x2)
           (stream-cons x2 (merge s1 (stream-rest s2))))
          (else
           (stream-cons x1 (merge (stream-rest s1) (stream-rest s2)))))))


(define integers (integers-from 1))

(define hamming-numbers
  (stream-cons 1 (merge (scale hamming-numbers 2)
                        (merge (scale hamming-numbers 3)
                               (scale hamming-numbers 5)))))

Solution

Your algorithm is incorrect. Your m2, m3, m5 should be scaling hamming_numbers, not integers.

The major problem is this: your merge() calls next() for both its arguments unconditionally, so both get advanced one step. So after producing the first number, e.g. 2 for the m23 generator, on the next invocation it sees its 1st argument as 4(,6,8,...) and 2nd as 6(,9,12,...). The 3 is already gone. So it always pulls both its arguments, and always returns the head of the 1st (test entry at http://ideone.com/doeX2Q).
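To see this concretely, here is a condensed reproduction of the fault (an illustrative sketch of the question's merge, with the superfluous iter() wrappers dropped): driving it with two scaled streams emits exactly the first stream's values, while the second stream's elements are silently discarded.

```python
from itertools import islice

def scale(s, m):
    return (x * m for x in s)

def integers():
    n = 0
    while True:
        n += 1
        yield n

def merge(s1, s2):                   # the question's version, condensed
    it1, it2 = iter(s1), iter(s2)
    x1, x2 = next(it1), next(it2)    # BOTH streams are advanced here...
    if x1 < x2:
        x, it = x1, merge(it1, s2)   # ...but the losing x2 is never yielded
    elif x1 > x2:
        x, it = x2, merge(s1, it2)
    else:
        x, it = x1, merge(it1, it2)
    yield x
    while True:
        yield next(it)

print(list(islice(merge(scale(integers(), 2), scale(integers(), 3)), 5)))
# -> [2, 4, 6, 8, 10] : the 3, 9, 15, ... from the second stream are lost
```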

Calling iter() is totally superfluous; it adds nothing here. When I remove it (http://ideone.com/7tk85h), the program works exactly the same and produces exactly the same (wrong) output. Normally iter() serves to create a lazy iterator object, but its arguments here are already generators.

There's no need to call iter() in your sieve() either (http://ideone.com/kYh7Di). sieve() already defines a generator, and filter() in Python 3 creates an iterator from a function and an iterable (generators are iterable). See also e.g. Difference between Python's Generators and Iterators .

We can do the merge like this, instead:

def merge(s1, s2):
  x1, x2 = next(s1), next(s2)
  while True:
    if x1 < x2:
        yield x1
        x1 = next(s1)
    elif x1 > x2:
        yield x2
        x2 = next(s2)
    else:
        yield x1
        x1, x2 = next(s1), next(s2)

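With a merge like this, the self-referential definition from the Racket version can be sketched in Python too (hamming below is my illustration, not code from the answer; unlike Racket's memoized streams, each recursive call re-enumerates the sequence from scratch, so this is correct but slow for large n):

```python
from itertools import islice

def merge(s1, s2):                            # the corrected merge
    x1, x2 = next(s1), next(s2)
    while True:
        if x1 < x2:
            yield x1
            x1 = next(s1)
        elif x1 > x2:
            yield x2
            x2 = next(s2)
        else:                                 # equal: emit once, advance both
            yield x1
            x1, x2 = next(s1), next(s2)

def hamming():
    yield 1                                   # 1 seeds the self-reference
    m2 = (2 * x for x in hamming())           # scale the sequence itself,
    m3 = (3 * x for x in hamming())           #   not the plain integers
    m5 = (5 * x for x in hamming())
    for x in merge(m2, merge(m3, m5)):
        yield x

print(list(islice(hamming(), 15)))
# -> [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
```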

Recursion in itself is also non-essential in defining the sieve() function. In fact it only serves to obscure an enormous deficiency in that code. Any prime it produces will be tested by all the primes below it in value - but only those below its square root are truly needed. We can fix this quite easily in a non-recursive style (http://ideone.com/Qaycpe):

def sieve(s):    # call as: sieve( integers_from(2))
    x = next(s)  
    yield x
    ps = sieve( integers_from(2))           # independent primes supply
    p = next(ps) 
    q = p*p       ; print((p,q))
    while True:
        x = next(s)
        while x<q: 
            yield x
            x = next(s)
        # here x == q
        s = filter(lambda y,p=p: y % p, s)  # filter creation postponed 
        p = next(ps)                        #   until square of p seen in input
        q = p*p 

This is now much, much, much more efficient (see also: Explain this chunk of haskell code that outputs a stream of primes ).
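As a self-contained check of the above, here is a runnable sketch with the assumed integers_from helper supplied and the debug print dropped:

```python
from itertools import islice

def integers_from(n):                        # assumed helper
    while True:
        yield n
        n += 1

def sieve(s):                                # the postponed-filters sieve
    x = next(s)
    yield x
    ps = sieve(integers_from(2))             # independent primes supply
    p = next(ps)
    q = p * p
    while True:
        x = next(s)
        while x < q:                         # below p*p everything left is prime
            yield x
            x = next(s)
        # here x == q, the square of p: only now install the next filter
        s = filter(lambda y, p=p: y % p, s)
        p = next(ps)
        q = p * p

print(list(islice(sieve(integers_from(2)), 10)))
# -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```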

Recursive or not is just a syntactic characteristic of the code. The actual run-time structures are the same - the filter() adaptors get hoisted on top of the input stream - either at the appropriate moments, or way too soon (so we end up with way too many of them).
