使用itertools.product并想要种子值 [英] Using itertools.product and want to seed a value

查看:166
本文介绍了使用itertools.product并想要种子值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

所以我写了一个小脚本来从网站下载图片。它通过一个7个alpha charactor值,其中第一个char总是一个数字。问题是如果我想停止脚本并重新启动,我必须从头开始。

So I've wrote a small script to download pictures from a website. It goes through a 7 alpha charactor value, where the first char is always a number. The problem is if I want to stop the script and start it up again I have to start all over.

我可以使用最后一个值来种子itertools.product所以我不必再经过他们了。

Can I seed itertools.product somehow with the last value I got so I don't have to go through them all again.

感谢任何输入。

这里是代码的一部分:

numbers = '0123456789'
alnum = numbers + 'abcdefghijklmnopqrstuvwxyz'

len7 = itertools.product(numbers, alnum, alnum, alnum, alnum, alnum, alnum) # length 7

for p in itertools.chain(len7):
    currentid = ''.join(p) 

    #semi static vars
    url = 'http://mysite.com/images/'
    url += currentid

    #Need to get the real url cause the redirect
    print "Trying " + url
    req = urllib2.Request(url)
    res = openaurl(req)
    if res == "continue": continue
    finalurl = res.geturl()

    #ok we have the full url now time to if it is real
    try: file = urllib2.urlopen(finalurl)
    except urllib2.HTTPError, e:
        print e.code

    im = cStringIO.StringIO(file.read())
    img = Image.open(im)
    writeimage(img)


推荐答案

这里是一个基于pypy的库代码的解决方案(感谢agf在评论中的建议)。

here's a solution based on pypy's library code (thanks to agf's suggestion in the comments).

状态可通过 .state 属性,可以通过 .goto(state)重置,其中 state 是序列中的索引(从0开始)。最后有一个演示(你需要向下滚动,我很害怕)。

the state is available via the .state attribute and can be reset via .goto(state) where state is an index into the sequence (starting at 0). there's a demo at the end (you need to scroll down, i'm afraid).

这比丢弃值更快。

> cat prod.py 

class product(object):

    def __init__(self, *args, **kw):
        if len(kw) > 1:
            raise TypeError("product() takes at most 1 argument (%d given)" %
                             len(kw))
        self.repeat = kw.get('repeat', 1)
        self.gears = [x for x in args] * self.repeat
        self.num_gears = len(self.gears)
        self.reset()

    def reset(self):
        # initialization of indicies to loop over
        self.indicies = [(0, len(self.gears[x]))
                         for x in range(0, self.num_gears)]
        self.cont = True
        self.state = 0

    def goto(self, n):
        self.reset()
        self.state = n
        x = self.num_gears
        while n > 0 and x > 0:
            x -= 1
            n, m = divmod(n, len(self.gears[x]))
            self.indicies[x] = (m, self.indicies[x][1])
        if n > 0:
            self.reset()
            raise ValueError("state exceeded")

    def roll_gears(self):
        # Starting from the end of the gear indicies work to the front
        # incrementing the gear until the limit is reached. When the limit
        # is reached carry operation to the next gear
        self.state += 1
        should_carry = True
        for n in range(0, self.num_gears):
            nth_gear = self.num_gears - n - 1
            if should_carry:
                count, lim = self.indicies[nth_gear]
                count += 1
                if count == lim and nth_gear == 0:
                    self.cont = False
                if count == lim:
                    should_carry = True
                    count = 0
                else:
                    should_carry = False
                self.indicies[nth_gear] = (count, lim)
            else:
                break

    def __iter__(self):
        return self

    def next(self):
        if not self.cont:
            raise StopIteration
        l = []
        for x in range(0, self.num_gears):
            index, limit = self.indicies[x]
            l.append(self.gears[x][index])
        self.roll_gears()
        return tuple(l)

p = product('abc', '12')
print list(p)
p.reset()
print list(p)
p.goto(2)
print list(p)
p.goto(4)
print list(p)
> python prod.py 
[('a', '1'), ('a', '2'), ('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
[('a', '1'), ('a', '2'), ('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
[('b', '1'), ('b', '2'), ('c', '1'), ('c', '2')]
[('c', '1'), ('c', '2')]

你应该测试更多 - 我可能会犯了一个愚蠢的错误 - 但是想法很简单,所以你应该可以解决它:o)你可以自由地使用我的更改;不知道原始的pypy许可证是什么。

you should test it more - i may have made a dumb mistake - but the idea is quite simple, so you should be able to fix it :o) you're free to use my changes; no idea what the original pypy licence is.

状态不是真正的完整状态 - t包含原始参数 - 它只是序列中的索引。也许这可以称之为索引,但代码中已经有标记...

also state isn't really the full state - it doesn't include the original arguments - it's just an index into the sequence. maybe it would have been better to call it index, but there are already indici[sic]es in the code...

更新

这是一个更简单的版本,是一样的想法,但可以通过转换一系列数字。所以你只需要 imap 它超过 count(n)来获得序列偏移量 n

here's a simpler version that is the same idea but works by transforming a sequence of numbers. so you just imap it over count(n) to get the sequence offset by n.

> cat prod2.py 

from itertools import count, imap

def make_product(*values):
    def fold((n, l), v):
        (n, m) = divmod(n, len(v))
        return (n, l + [v[m]])
    def product(n):
        (n, l) = reduce(fold, values, (n, []))
        if n > 0: raise StopIteration
        return tuple(l)
    return product

print list(imap(make_product(['a','b','c'], [1,2,3]), count()))
print list(imap(make_product(['a','b','c'], [1,2,3]), count(3)))

def product_from(n, *values):
    return imap(make_product(*values), count(n))

print list(product_from(4, ['a','b','c'], [1,2,3]))

> python prod2.py 
[('a', 1), ('b', 1), ('c', 1), ('a', 2), ('b', 2), ('c', 2), ('a', 3), ('b', 3), ('c', 3)]
[('a', 2), ('b', 2), ('c', 2), ('a', 3), ('b', 3), ('c', 3)]
[('b', 2), ('c', 2), ('a', 3), ('b', 3), ('c', 3)]

(这里的缺点是,如果你想停止并重新启动,你需要跟踪自己有多少你使用)

(the downside here is that if you want to stop and restart you need to have kept track yourself of how many you have used)

这篇关于使用itertools.product并想要种子值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆