提高光线追踪命中功能的性能 [英] Improving performance of raytracing hit function

查看：62 发布时间：2021/6/15 19:23:09 python performance

本文介绍了提高光线追踪命中功能的性能的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我在 python 中有一个简单的光线追踪器.渲染一个 200x200 的图像需要 4 分钟，这对我来说绝对是太多了.我想改善这种情况.

I have a simple raytracer in python. rendering an image 200x200 takes 4 minutes, which is definitely too much for my taste. I want to improve the situation.

几点:我为每个像素拍摄多条光线(以提供抗锯齿)，每个像素总共有 16 条光线.200x200x16 总共有 640000 条光线.必须测试每条光线对场景中多个球体对象的影响.Ray也是一个相当琐碎的对象

Some points: I shoot multiple rays per each pixel (to provide antialiasing) for a grand total of 16 rays per pixel. 200x200x16 is a grand total of 640000 rays. Each ray must be tested for impact on multiple Sphere objects in the scene. Ray is also a rather trivial object

class Ray(object):
    def __init__(self, origin, direction):
        self.origin = numpy.array(origin)
        self.direction = numpy.array(direction)

Sphere 稍微复杂一些，并带有 hit/nohit 的逻辑:

Sphere is slightly more complex, and carries the logic for hit/nohit:

class Sphere(object):
    def __init__(self, center, radius, color):
        self.center = numpy.array(center)
        self.radius = numpy.array(radius)
        self.color = color

    @profile 
    def hit(self, ray):
        temp = ray.origin - self.center
        a = numpy.dot(ray.direction, ray.direction)
        b = 2.0 * numpy.dot(temp, ray.direction)
        c = numpy.dot(temp, temp) - self.radius * self.radius
        disc = b * b - 4.0 * a * c

        if (disc < 0.0):
            return None
        else:
            e = math.sqrt(disc)
            denom = 2.0 * a
            t = (-b - e) / denom 
            if (t > 1.0e-7):
                normal = (temp + t * ray.direction) / self.radius
                hit_point = ray.origin + t * ray.direction
                return ShadeRecord.ShadeRecord(normal=normal, 
                                               hit_point=hit_point, 
                                               parameter=t, 
                                               color=self.color)           

            t = (-b + e) / denom

            if (t > 1.0e-7):
                normal = (temp + t * ray.direction) / self.radius                hit_point = ray.origin + t * ray.direction
                return ShadeRecord.ShadeRecord(normal=normal, 
                                               hit_point=hit_point, 
                                               parameter=t, 
                                               color=self.color)       

        return None

现在，我运行了一些分析，看起来最长的处理时间是在 hit() 函数中

Now, I ran some profiling, and it appears that the longest processing time is in the hit() function

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2560000  118.831    0.000  152.701    0.000 raytrace/objects/Sphere.py:12(hit)
  1960020   42.989    0.000   42.989    0.000 {numpy.core.multiarray.array}
        1   34.566   34.566  285.829  285.829 raytrace/World.py:25(render)
  7680000   33.796    0.000   33.796    0.000 {numpy.core._dotblas.dot}
  2560000   11.124    0.000  163.825    0.000 raytrace/World.py:63(f)
   640000   10.132    0.000  189.411    0.000 raytrace/World.py:62(hit_bare_bones_object)
   640023    6.556    0.000  170.388    0.000 {map}

这并不让我感到惊讶，我想尽可能地降低这个值.我传给line profiling，结果是

This does not surprise me, and I want to reduce this value as much as possible. I pass to line profiling, and the result is

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    12                                               @profile
    13                                               def hit(self, ray):
    14   2560000     27956358     10.9     19.2          temp = ray.origin - self.center
    15   2560000     17944912      7.0     12.3          a = numpy.dot(ray.direction, ray.direction)
    16   2560000     24132737      9.4     16.5          b = 2.0 * numpy.dot(temp, ray.direction)
    17   2560000     37113811     14.5     25.4          c = numpy.dot(temp, temp) - self.radius * self.radius
    18   2560000     20808930      8.1     14.3          disc = b * b - 4.0 * a * c
    19                                                   
    20   2560000     10963318      4.3      7.5          if (disc < 0.0):
    21   2539908      5403624      2.1      3.7              return None
    22                                                   else:
    23     20092        75076      3.7      0.1              e = math.sqrt(disc)
    24     20092       104950      5.2      0.1              denom = 2.0 * a
    25     20092       115956      5.8      0.1              t = (-b - e) / denom
    26     20092        83382      4.2      0.1              if (t > 1.0e-7):
    27     20092       525272     26.1      0.4                  normal = (temp + t * ray.direction) / self.radius
    28     20092       333879     16.6      0.2                  hit_point = ray.origin + t * ray.direction
    29     20092       299494     14.9      0.2                  return ShadeRecord.ShadeRecord(normal=normal, hit_point=hit_point, parameter=t, color=self.color)

所以，看起来大部分时间都花在了这段代码上:

So, it appears that most of the time is spent in this chunk of code:

        temp = ray.origin - self.center
        a = numpy.dot(ray.direction, ray.direction)
        b = 2.0 * numpy.dot(temp, ray.direction)
        c = numpy.dot(temp, temp) - self.radius * self.radius
        disc = b * b - 4.0 * a * c

我真的没有看到很多要优化的地方.您知道如何在不使用 C 的情况下使此代码更高效吗?

Where I don't really see a lot to optimize. Do you have any idea how to make this code more performant without going C ?

推荐答案

查看您的代码，看起来您的主要问题是您的代码行被调用了 2560000 次.无论您在该代码中做什么类型的工作，这往往会花费大量时间.但是，使用 numpy，您可以将大量工作聚合到少量 numpy 调用中.

Looking at your code, it looks like your main problem is that you have lines of code that are being called 2560000 times. That will tend to take a lot of time regardless what kind of work you are doing in that code. However, using numpy, you can aggregate alot of this work into a small number of numpy calls.

首先要做的是将光线组合成大阵列.与其使用具有 1x3 原点和方向向量的 Ray 对象，不如使用 Nx3 阵列，该阵列具有命中检测所需的所有光线.您的命中函数的顶部最终将如下所示:

The first thing to do is to combine your rays together into large arrays. Instead of using a Ray object that has 1x3 vectors for origin and direction use Nx3 arrays that have all of the rays you need for the hit detection. The top of your hit function will end up looking like this:

temp = rays.origin - self.center
b = 2.0 * numpy.sum(temp * rays.direction,1)
c = numpy.sum(numpy.square(temp), 1) - self.radius * self.radius
disc = b * b - 4.0 * c

对于下一部分，您可以使用

For the next part you can use

possible_hits = numpy.where(disc >= 0.0)
a = a[possible_hits]
disc = disc[possible_hits]
...

仅继续使用通过判别式测试的值.通过这种方式，您可以轻松获得数量级的性能提升.

to continue with just the values that pass the discriminant test. You can easily get orders of magnitude performance improvements this way.

这篇关于提高光线追踪命中功能的性能的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

提高光线追踪命中功能的性能 [英] Improving performance of raytracing hit function

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

提高光线追踪命中功能的性能 [英] Improving performance of raytracing hit function

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭