Improving performance of raytracing hit function
Problem description
I have a simple raytracer in Python. Rendering a 200x200 image takes 4 minutes, which is definitely too much for my taste. I want to improve the situation.
Some points: I shoot multiple rays per pixel (to provide antialiasing), 16 rays per pixel in total. 200x200x16 gives a grand total of 640000 rays. Each ray must be tested for impact on multiple Sphere objects in the scene. Ray is also a rather trivial object:
class Ray(object):
    def __init__(self, origin, direction):
        self.origin = numpy.array(origin)
        self.direction = numpy.array(direction)
Sphere is slightly more complex, and carries the logic for hit/no hit:
class Sphere(object):
    def __init__(self, center, radius, color):
        self.center = numpy.array(center)
        self.radius = numpy.array(radius)
        self.color = color

    @profile
    def hit(self, ray):
        temp = ray.origin - self.center
        a = numpy.dot(ray.direction, ray.direction)
        b = 2.0 * numpy.dot(temp, ray.direction)
        c = numpy.dot(temp, temp) - self.radius * self.radius
        disc = b * b - 4.0 * a * c

        if (disc < 0.0):
            return None
        else:
            e = math.sqrt(disc)
            denom = 2.0 * a
            t = (-b - e) / denom
            if (t > 1.0e-7):
                normal = (temp + t * ray.direction) / self.radius
                hit_point = ray.origin + t * ray.direction
                return ShadeRecord.ShadeRecord(normal=normal,
                                               hit_point=hit_point,
                                               parameter=t,
                                               color=self.color)

            t = (-b + e) / denom

            if (t > 1.0e-7):
                normal = (temp + t * ray.direction) / self.radius
                hit_point = ray.origin + t * ray.direction
                return ShadeRecord.ShadeRecord(normal=normal,
                                               hit_point=hit_point,
                                               parameter=t,
                                               color=self.color)

        return None
Now, I ran some profiling, and it appears that most of the processing time is spent in the hit() function:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  2560000  118.831    0.000  152.701    0.000 raytrace/objects/Sphere.py:12(hit)
  1960020   42.989    0.000   42.989    0.000 {numpy.core.multiarray.array}
        1   34.566   34.566  285.829  285.829 raytrace/World.py:25(render)
  7680000   33.796    0.000   33.796    0.000 {numpy.core._dotblas.dot}
  2560000   11.124    0.000  163.825    0.000 raytrace/World.py:63(f)
   640000   10.132    0.000  189.411    0.000 raytrace/World.py:62(hit_bare_bones_object)
   640023    6.556    0.000  170.388    0.000 {map}
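A table with these columns is what Python's built-in cProfile module prints. As a minimal sketch of how such a report can be produced (the `render` function below is a hypothetical stand-in that only burns CPU time, not the actual `World.render`):

```python
import cProfile
import io
import pstats


def render():
    # Hypothetical stand-in for the real render loop; it only burns CPU time.
    return sum(i * i for i in range(100000))


profiler = cProfile.Profile()
profiler.enable()
result = render()
profiler.disable()

# Sort by the time spent inside each function itself (the "tottime" column)
# and keep only the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by `tottime` puts the functions that are expensive in their own right at the top, which is how `hit` stands out in the table above.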
This does not surprise me, and I want to reduce this value as much as possible. I moved on to line profiling, and the result is:
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    12                                           @profile
    13                                           def hit(self, ray):
    14   2560000     27956358     10.9     19.2      temp = ray.origin - self.center
    15   2560000     17944912      7.0     12.3      a = numpy.dot(ray.direction, ray.direction)
    16   2560000     24132737      9.4     16.5      b = 2.0 * numpy.dot(temp, ray.direction)
    17   2560000     37113811     14.5     25.4      c = numpy.dot(temp, temp) - self.radius * self.radius
    18   2560000     20808930      8.1     14.3      disc = b * b - 4.0 * a * c
    19
    20   2560000     10963318      4.3      7.5      if (disc < 0.0):
    21   2539908      5403624      2.1      3.7          return None
    22                                               else:
    23     20092        75076      3.7      0.1          e = math.sqrt(disc)
    24     20092       104950      5.2      0.1          denom = 2.0 * a
    25     20092       115956      5.8      0.1          t = (-b - e) / denom
    26     20092        83382      4.2      0.1          if (t > 1.0e-7):
    27     20092       525272     26.1      0.4              normal = (temp + t * ray.direction) / self.radius
    28     20092       333879     16.6      0.2              hit_point = ray.origin + t * ray.direction
    29     20092       299494     14.9      0.2              return ShadeRecord.ShadeRecord(normal=normal, hit_point=hit_point, parameter=t, color=self.color)
So, it appears that most of the time is spent in this chunk of code:
temp = ray.origin - self.center
a = numpy.dot(ray.direction, ray.direction)
b = 2.0 * numpy.dot(temp, ray.direction)
c = numpy.dot(temp, temp) - self.radius * self.radius
disc = b * b - 4.0 * a * c
I don't really see much to optimize here. Do you have any idea how to make this code more performant without resorting to C?
Recommended answer
Looking at your code, your main problem seems to be that you have lines of code that are called 2560000 times. That tends to take a lot of time regardless of what kind of work those lines do. With numpy, however, you can aggregate a lot of this work into a small number of numpy calls.
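To make the cost of many small calls concrete, here is a rough sketch that computes the same per-ray dot products once with one numpy.dot call per ray and once with a single batched expression. The array size, the random test data, and the function names are assumptions chosen for illustration; actual timings depend on the machine and the numpy version.

```python
import timeit

import numpy

rng = numpy.random.default_rng(0)
# Hypothetical test data: 10000 direction vectors stored as one N x 3 array.
directions = rng.normal(size=(10000, 3))


def dots_one_by_one(vectors):
    # One numpy.dot call per ray, as in the per-ray hit() method.
    return numpy.array([numpy.dot(v, v) for v in vectors])


def dots_batched(vectors):
    # One multiplication and one sum that cover every ray at once.
    return numpy.sum(vectors * vectors, axis=1)


slow = dots_one_by_one(directions)
fast = dots_batched(directions)

# Both versions compute the same values; only the number of calls differs.
print(timeit.timeit(lambda: dots_one_by_one(directions), number=5))
print(timeit.timeit(lambda: dots_batched(directions), number=5))
```

The two functions return the same numbers; the difference is that the first pays Python and numpy call overhead once per ray, while the second pays it once per batch.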
The first thing to do is to combine your rays into large arrays. Instead of a Ray object that holds 1x3 vectors for origin and direction, use Nx3 arrays that hold all of the rays you need for hit detection. The top of your hit function will end up looking like this:
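temp = rays.origin - self.center
# a is kept from the per-ray version so directions need not be unit length
a = numpy.sum(numpy.square(rays.direction), 1)
b = 2.0 * numpy.sum(temp * rays.direction, 1)
c = numpy.sum(numpy.square(temp), 1) - self.radius * self.radius
disc = b * b - 4.0 * a * c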
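As a small worked example of the batched discriminant, the sketch below uses plain Nx3 arrays in place of a rays object. The three rays, the sphere, and the variable names are made up for illustration, and the `a` term from the question's per-ray code is kept so that directions do not have to be normalized.

```python
import numpy

# Hypothetical batch of N = 3 rays, stored as two N x 3 arrays.
origins = numpy.array([[0.0, 0.0, -5.0],
                       [0.0, 0.0, -5.0],
                       [0.0, 3.0, -5.0]])
directions = numpy.array([[0.0, 0.0, 1.0],
                          [0.0, 0.1, 1.0],
                          [0.0, 0.0, 1.0]])
center = numpy.array([0.0, 0.0, 0.0])
radius = 1.0

# Broadcasting subtracts the (3,) center from every row of the (N, 3) array.
temp = origins - center
a = numpy.sum(directions * directions, axis=1)
b = 2.0 * numpy.sum(temp * directions, axis=1)
c = numpy.sum(temp * temp, axis=1) - radius * radius
disc = b * b - 4.0 * a * c

# One discriminant per ray: non-negative for the first two rays (possible
# hits), negative for the third, which passes above the sphere.
print(disc)
```

Each line now runs once per batch instead of once per ray, and every intermediate value is an array of length N.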
For the next part you can use
possible_hits = numpy.where(disc >= 0.0)
a = a[possible_hits]
disc = disc[possible_hits]
...
to continue with just the values that pass the discriminant test. You can easily get orders-of-magnitude performance improvements this way.
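Putting the pieces together, one possible shape for a fully batched hit test is sketched below. The function name `hit_batch`, its signature, and the choice to mark misses with NaN are assumptions made for this example and are not part of the original code; building ShadeRecord objects for the hits is left out.

```python
import numpy


def hit_batch(origins, directions, center, radius, eps=1.0e-7):
    """Return the parameter t of the nearest hit per ray, NaN where a ray misses."""
    temp = origins - center
    a = numpy.sum(directions * directions, axis=1)
    b = 2.0 * numpy.sum(temp * directions, axis=1)
    c = numpy.sum(temp * temp, axis=1) - radius * radius
    disc = b * b - 4.0 * a * c

    t = numpy.full(len(origins), numpy.nan)
    # Indices of the rays that pass the discriminant test.
    possible_hits = numpy.where(disc >= 0.0)[0]

    # Continue only with those rays.
    e = numpy.sqrt(disc[possible_hits])
    denom = 2.0 * a[possible_hits]
    t_near = (-b[possible_hits] - e) / denom
    t_far = (-b[possible_hits] + e) / denom

    # Same rule as the per-ray code: take the smaller root if it lies in
    # front of the ray origin, otherwise fall back to the larger root.
    chosen = numpy.where(t_near > eps, t_near, t_far)
    valid = chosen > eps
    t[possible_hits[valid]] = chosen[valid]
    return t


# Hypothetical rays: one hits from outside, one misses, one starts at the center.
origins = numpy.array([[0.0, 0.0, -5.0], [0.0, 3.0, -5.0], [0.0, 0.0, 0.0]])
directions = numpy.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
t = hit_batch(origins, directions, numpy.array([0.0, 0.0, 0.0]), 1.0)

# Hit points for all rays at once; rows belonging to misses contain NaN.
hit_points = origins + t[:, None] * directions
print(t)
```

With several spheres in the scene, the same function can be called once per sphere and the smallest non-NaN t per ray kept, so the Python-level loop runs over spheres rather than over rays.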