Optimizing dict of set of tuple of ints with Numba?

Question

I am learning how to use Numba (while I am already fairly familiar with Cython). How should I go about speeding up this code? Notice the function returns a dict of sets of two-tuples of ints. I am using IPython notebook. I would prefer Numba over Cython.

@autojit
def generateadj(width,height):
    adj = {}
    for y in range(height):
        for x in range(width):
            s = set()
            if x>0:
                s.add((x-1,y))
            if x<width-1:
                s.add((x+1,y))
            if y>0:
                s.add((x,y-1))
            if y<height-1:
                s.add((x,y+1))
            adj[x,y] = s
    return adj

I managed to write this in Cython but I had to give up on the way data is structured. I do not like this. I read somewhere in Numba documentation that it can work with basic things like lists, tuples, etc.

%%cython
import numpy as np

def generateadj(int width, int height):
    cdef int[:,:,:,:] adj = np.zeros((width,height,4,2), np.int32)
    cdef int count

    for y in range(height):
        for x in range(width):
            count = 0
            if x>0:
                adj[x,y,count,0] = x-1
                adj[x,y,count,1] = y
                count += 1
            if x<width-1:
                adj[x,y,count,0] = x+1
                adj[x,y,count,1] = y
                count += 1
            if y>0:
                adj[x,y,count,0] = x
                adj[x,y,count,1] = y-1
                count += 1
            if y<height-1:
                adj[x,y,count,0] = x
                adj[x,y,count,1] = y+1
                count += 1
            for i in range(count,4):
                adj[x,y,i] = adj[x,y,0]
    return adj

Solution

While numba supports such Python data structures as dicts and sets, it does so in object mode. From the numba glossary, object mode is defined as:

A Numba compilation mode that generates code that handles all values as Python objects and uses the Python C API to perform all operations on those objects. Code compiled in object mode will often run no faster than Python interpreted code, unless the Numba compiler can take advantage of loop-jitting.

So when writing numba code, you need to stick to data types that numba can compile natively, such as NumPy arrays. Here's some code that does just that:

import numpy as np
from numba import jit

@jit
def gen_adj_loop(width, height, adj):
    i = 0
    for x in range(width):
        for y in range(height):
            if x > 0:
                adj[i,0] = x
                adj[i,1] = y
                adj[i,2] = x - 1
                adj[i,3] = y
                i += 1

            if x < width - 1:
                adj[i,0] = x
                adj[i,1] = y
                adj[i,2] = x + 1
                adj[i,3] = y
                i += 1

            if y > 0:
                adj[i,0] = x
                adj[i,1] = y
                adj[i,2] = x
                adj[i,3] = y - 1
                i += 1

            if y < height - 1:
                adj[i,0] = x
                adj[i,1] = y
                adj[i,2] = x
                adj[i,3] = y + 1
                i += 1
    return

This fills a preallocated array adj in place. Each row has the form x y adj_x adj_y. So for a pixel at (3,4) that is not on the edge of the grid, we'd have the four rows:

3 4 2 4
3 4 4 4
3 4 3 3
3 4 3 5

We can wrap the above function in another:

@jit
def gen_adj(width, height):
    # each pixel has four neighbors, but some of these neighbors are
    # off the grid -- 2*width + 2*height of them to be exact
    n_entries = width*height*4 - 2*width - 2*height
    # np.int64 rather than the builtin int, which numba's compiled
    # (nopython) mode may not accept as a dtype
    adj = np.zeros((n_entries, 4), dtype=np.int64)
    gen_adj_loop(width, height, adj)
    return adj
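
The row count used above can be sanity-checked without numba. The following is a minimal sketch in plain Python; the helper names count_neighbors and n_entries are made up for this illustration and are not part of the answer's code. It counts the in-grid neighbors by brute force, using the same four tests as the loop, and compares the result with the closed-form expression.

```python
def count_neighbors(width, height):
    """Count in-grid neighbor pairs by brute force, using the same four tests."""
    total = 0
    for x in range(width):
        for y in range(height):
            # Each comparison is a bool, which adds as 0 or 1.
            total += (x > 0) + (x < width - 1) + (y > 0) + (y < height - 1)
    return total


def n_entries(width, height):
    # Formula from the wrapper: 4 candidate neighbors per pixel, minus the
    # 2*width + 2*height candidates that fall off the grid.
    return width * height * 4 - 2 * width - 2 * height


# Compare the two on a handful of small grid sizes.
checks = [(w, h, count_neighbors(w, h) == n_entries(w, h))
          for w in range(1, 6) for h in range(1, 6)]
all_match = all(ok for _, _, ok in checks)
```

If the formula were off by even one row, gen_adj_loop would either write past the end of adj or leave zero-filled rows at the end, so the array size has to match exactly.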

This function is very fast, but incomplete. We must convert adj to a dictionary of the form in your question. The problem is that this is a very slow process. We must iterate over the adj array and add each entry to a Python dictionary. This cannot be jitted by numba.
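
For reference, the conversion step might look something like the sketch below. The name adj_to_dict is hypothetical (it is not part of the code above), and the sketch assumes adj is an integer array whose rows have the form x y adj_x adj_y. It runs as ordinary interpreted Python, which is exactly why it is slow for large grids.

```python
import numpy as np


def adj_to_dict(adj):
    """Convert rows of (x, y, adj_x, adj_y) into {(x, y): {(adj_x, adj_y), ...}}."""
    result = {}
    # tolist() turns NumPy integers into plain Python ints up front,
    # so the keys and set members are ordinary tuples of ints.
    for x, y, ax, ay in adj.tolist():
        result.setdefault((x, y), set()).add((ax, ay))
    return result


# Hand-written rows for a 2x1 grid: (0,0) and (1,0) are each other's only neighbor.
example = np.array([[0, 0, 1, 0],
                    [1, 0, 0, 0]])
converted = adj_to_dict(example)
```

One difference from the function in the question: a pixel with no neighbors at all (which only happens on a 1x1 grid) produces no rows, so it would not get a key with an empty set here.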

So the bottom line is this: the requirement that the result be a dictionary of sets of tuples really constrains how much you can optimize this code.
