Pypy vs Python中的计数算法性能优化(Numpy vs List) [英] Counting Algorithm Performance Optimization in Pypy vs Python (Numpy vs List)

查看：98 发布时间：2020/5/21 21:04:37 python performance optimization numpy pypy

本文介绍了Pypy vs Python中的计数算法性能优化(Numpy vs List)的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我的期望是pypy可能比python快一个数量级，但结果表明pypy实际上比预期的要慢.

My expectation was that pypy could be as much as an order of magnitude faster than python, but the results indicate that pypy is in fact slower than expected.

我有两个问题:

为什么pypy与pypy相比会明显变慢?
我可以做些什么来优化算法以使pypy(或python)更快?

产生的时间:

Python 2.7.5

#点:16,777,216(8 ** 3 * 32 ** 3)
Xrange时间:1487.15 ms
Xrange numpy时间:2553.98 ms
点生成时间:6162.23毫秒
Numpy Gen时间:13894.73毫秒

Pypy 2.2.1

#点:16,777,216(8 ** 3 * 32 ** 3)
Xrange时间:129.48毫秒
Xrange numpy时间:4644.12毫秒
点生成时间:4643.82毫秒
Numpy Gen时间:44168.98毫秒

算法:

我正在使用一种简单的算法生成一个空间点列表，并且正在尝试优化该算法.

I am generating a list of points in space using a simple algorithm and I'm trying to optimize the algorithm.

def generate(size=32, point=(0, 0, 0), width=32):
    """
    generate points in space around a center point with a specific width and
      number of divisions (size)
    """
    X, Y, Z = point
    half = width * 0.5
    delta = width
    scale = width / size
    offset = scale * 0.5
    X = X + offset - half
    Y = Y + offset - half
    Z = Z + offset - half
    for x in xrange(size):
        x = (x * scale) + X
        for y in xrange(size):
            y = (y * scale) + Y
            for z in xrange(size):
                z = (z * scale) + Z
                yield (x, y, z)

在优化中，我开始考虑使用pypy而不是python.在比较两者时，我提出了几种不同的方案:

In optimizing, I started looking at using pypy rather than python. And in comparing the two, I came up with a few different scenarios:

使用xrange计数

counting using xrange

rsize = 8    # size of region
csize = 32   # size of chunk
number_of_points = rsize ** 3 * csize ** 3
[x for x in xrange(number_of_points)]

使用numpy和xrange进行计数

counting using xrange with numpy

rsize = 8    # size of region
csize = 32   # size of chunk
number_of_points = rsize ** 3 * csize ** 3
np.array([x for x in xrange(number_of_points)])

使用上述算法运行

running with the algorithm above

rsize = 8    # size of region
csize = 32   # size of chunk
[p
 for rp in generate(size=rsize, width=rsize*csize)
 for p in generate(size=csize, width=csize, point=rp)]

使用numpy运行上述算法

running the algorithm above with numpy

rsize = 8    # size of region
csize = 32   # size of chunk
np.array([p
 for rp in generate(size=rsize, width=rsize*csize)
 for p in generate(size=csize, width=csize, point=rp)])

背景:

我正在尝试创建体素引擎，并且我想优化算法以将生成时间降低到可管理的水平.虽然我显然无法实现接近Java/C ++的任何功能，但我想尽可能地将python(或pypy)推入.

I am attempting to create a voxel engine, and I want to optimize my algorithm to bring down generation times to a manageable level. While I clearly won't achieve anything close to Java/C++, I'd like to push python (or pypy) as far as I can.

我注意到，从某些早期的时间来看，列表查找要比字典快得多.并且列表也比元组快(出乎意料)，尽管元组生成得更快. Numpy的读取时间比非numpy的读取时间还要快.但是，创建numpy的时间可能要慢几个数量级.

I noticed that list lookups are significantly faster than dictionaries from some earlier timings. And lists are also faster than tuples (unexpected), though tuples generate faster. Numpy has even faster read times than non-numpy. However, numpy creation time can be orders of magnitude slower.

因此，如果阅读是最重要的，那么使用numpy有明显的优势.但是，如果阅读和创作同等重要，那么最好选择一个简单的清单.就是说，我没有一个干净的方法来查看内存使用情况，但是我怀疑列表的存储效率远不及元组或numpy. 另外，虽然差异很小，但我发现字典上的.get比使用__ getitem __调用要快一些(即dictionary [lookup] vs. dicitonary.get(lookup))

So if reading is most important, there is a clear advantage for using a numpy. If reading and creation are of equal importance, however, then a straight list is likely best. That said, I don't have a clean way of looking at memory usage, but I suspect that lists are far less memory efficient than tuples or numpy. Also, while it's a small difference, I found .get on a dictionary to be slightly faster than using the __ getitem __ call (i.e. dictionary[lookup] vs. dicitonary.get(lookup) )

时间...

Python 2.7.5

阅读

 - Option 1: tuple access... 2045.51 ms
 - Option 2: tuple access (again)... 2081.97 ms    # sampling effect of cache
 - Option 3: list access... 2072.09 ms
 - Option 4: dict access... 3436.53 ms
 - Option 5: iterable creation... N/A
 - Option 6: numpy array... 1752.44 ms

创造

 - Option 1: tuple creation... 690.36 ms
 - Option 2: tuple creation (again)... 716.49 ms    # sampling effect of cache
 - Option 3: list creation... 684.28 ms
 - Option 4: dict creation... 1498.94 ms
 - Option 5: iterable creation...  0.01 ms
 - Option 6: numpy creation... 3514.25 ms

Pypy 2.2.1

阅读

 - Option 1: tuple access... 243.34 ms
 - Option 2: tuple access (again)... 246.51 ms    # sampling effect of cache
 - Option 3: list access... 139.65 ms
 - Option 4: dict access... 454.65 ms
 - Option 5: iterable creation... N/A
 - Option 6: numpy array... 21.60 ms

创造

 - Option 1: tuple creation... 1016.27 ms
 - Option 2: tuple creation (again)... 1063.50 ms    # sampling effect of cache
 - Option 3: list creation... 365.98 ms
 - Option 4: dict creation... 2258.44 ms
 - Option 5: iterable creation...  0.00 ms
 - Option 6: numpy creation... 12514.20 ms

在所有示例中，均针对随机数据生成了随机查找.

In all the examples, a random lookup was generated for random data.

dsize = 10 ** 7   # or 10 million data points
data = [(i, random.random()*dsize)
         for i in range(dsize)]
lookup = tuple(int(random.random()*dsize) for i in range(dsize))

循环非常简单:

for x in lookup:
    data_of_specific_type[x]

特定类型的数据 是将数据转换为该类型(例如元组(数据)，列表(数据)等)

And the data_of_specific_type was a conversion of data into that type (e.g. tuple(data), list(data), etc.)

Pypy vs Python中的计数算法性能优化(Numpy vs List) [英] Counting Algorithm Performance Optimization in Pypy vs Python (Numpy vs List)

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Pypy vs Python中的计数算法性能优化(Numpy vs List) [英] Counting Algorithm Performance Optimization in Pypy vs Python (Numpy vs List)

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭