Using "sincos" in Java
Question
In a lot of situations I not only need the sine, but also the cosine of the same parameter.
For C, there is the sincos function in the common Unix m math library. And actually, at least on i386, this should be a single assembly instruction, fsincos.
sincos, sincosf, sincosl - calculate sin and cos simultaneously
I guess these benefits exist because there is an obvious overlap in computing sine and cosine: sin(x)^2 + cos(x)^2 = 1. But AFAIK it does not pay off to try to shortcut this as cos = Math.sqrt(1 - sin*sin), as the sqrt function comes at a similar cost.
Is there any way to reap the same benefits in Java? I guess I'm going to pay a price for a double[] then, which may make all the effort moot because of the added garbage collection.
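One way to sidestep the allocation concern, sketched here under assumptions, is to let the caller supply the output array so that nothing is allocated per call. The class name SinCosBuffer and the method sinCos are hypothetical names chosen for illustration. The sketch still calls Math.sin and Math.cos separately, so it only addresses the garbage question, not the cost of computing both values.

```java
final class SinCosBuffer {
    /** Writes sin(x) to out[0] and cos(x) to out[1]; allocates nothing per call. */
    static void sinCos(double x, double[] out) {
        out[0] = Math.sin(x);
        out[1] = Math.cos(x);
    }

    public static void main(String[] args) {
        double[] out = new double[2]; // allocated once, reused for every call
        for (double x = 0.0; x < 1.0; x += 0.25) {
            sinCos(x, out);
            System.out.println(x + ": sin=" + out[0] + " cos=" + out[1]);
        }
    }
}
```

Whether this actually beats returning a fresh double[] depends on the JIT (a short-lived array may be removed by escape analysis), so it would need to be measured rather than assumed.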
Or is the Hotspot compiler smart enough to recognize that I need both, and will compile this to a sincos command? Can I test whether it recognizes it, and can I help it recognize this, e.g. by making sure the Math.sin and Math.cos commands are directly successive in my code? This would actually make the most sense from a Java language point of view: having the compiler optimize this to use the fsincos assembly call.
Collected from some assembler documentation:
Variations   8087      287       387       486       Pentium
fsin         -         -         122-771   257-354   16-126    NP
fsincos      -         -         194-809   292-365   17-137    NP
  Additional cycles required if operand > pi/4 (~3.141/4 = ~.785)
sqrt         180-186   180-186   122-129   83-87     70        NP
fsincos should need an extra pop, but that should come at 1 clock cycle. Assuming that the CPU also does not optimize this, sincos should be almost twice as fast as calling sin twice (second time to compute cosine; so I figure it will need to do an addition). sqrt could be faster in some situations, but sine can be faster.
Update: I've done some experiments in C, but they are inconclusive. Interestingly enough, sincos seems to be even slightly faster than sin (without cos), and the GCC compiler will use fsincos when you compute both sin and cos - so it does what I'd like Hotspot to do (or does Hotspot, too?). I could not yet prevent the compiler from outsmarting me by using fsincos except by not using cos. It will then fall back to a C sin, not fsin.
Recommended answer
I have performed some microbenchmarks with Caliper: 10000000 iterations over a (precomputed) array of random numbers in the range -4*pi .. 4*pi. I tried my best to get the fastest JNI solution I could come up with - it's a bit hard to predict whether you will actually get fsincos or some emulated sincos. Reported numbers are the best of 10 Caliper trials (which in turn consist of 3-10 trials, the average of which is reported). So roughly it's 30-100 runs of the inner loop each.
I've benchmarked several variants:
- Math.sin only (reference)
- Math.cos only (reference)
- Math.sin + Math.cos
- sincos via JNI
- Math.sin + cos via Math.sqrt((1+sin) * (1-sin)) + sign reconstruction
- Math.cos + sin via Math.sqrt((1+cos) * (1-cos)) + sign reconstruction
(1+sin)*(1-sin) = 1-sin*sin mathematically, but if sin is close to 1 the factored form should be more precise? The runtime difference is minimal; the plain 1-sin*sin form saves one addition.
Sign reconstruction is done via x %= TWOPI; if (x < 0) x += TWOPI; and then checking the quadrant. If you have an idea how to do this with less CPU, I'd be happy to hear it.
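A minimal sketch of the sin + sqrt variant with this sign reconstruction might look as follows. The class name SinSqrt, the method sinCos and the constant HALFPI are hypothetical names, and the code is a reconstruction from the description above, not the benchmarked code itself.

```java
final class SinSqrt {
    static final double TWOPI = 2.0 * Math.PI;
    static final double HALFPI = 0.5 * Math.PI;

    /** Writes sin(x) to out[0] and a sqrt-based cos(x) to out[1]. */
    static void sinCos(double x, double[] out) {
        double s = Math.sin(x);
        // |cos(x)| from sin^2 + cos^2 = 1, in the factored form
        double c = Math.sqrt((1.0 + s) * (1.0 - s));
        // reduce the angle to [0, 2*pi)
        x %= TWOPI;
        if (x < 0) x += TWOPI;
        // cosine is negative in the second and third quadrant
        if (x > HALFPI && x < 3.0 * HALFPI) c = -c;
        out[0] = s;
        out[1] = c;
    }

    public static void main(String[] args) {
        double[] out = new double[2];
        sinCos(2.0, out);
        System.out.println("sqrt-based cos(2) = " + out[1] + ", Math.cos(2) = " + Math.cos(2.0));
    }
}
```

The cos + sqrt variant would be analogous, with the sign of the sine decided by whether the reduced angle lies above pi.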
Numerical loss via sqrt seems to be okay, at least for common angles: on the order of 1e-10 in rough experiments.
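A rough experiment of this kind could be sketched as below. It compares both sqrt forms against the magnitude of Math.cos over random angles in -4*pi .. 4*pi. The class name SqrtErrorCheck, the sample count and the seed are arbitrary assumptions, and the errors it prints will depend on how close the sampled angles come to odd multiples of pi/2, where the sqrt amplifies rounding error most.

```java
import java.util.Random;

final class SqrtErrorCheck {
    /** Returns {max error of sqrt(1-s*s), max error of sqrt((1+s)*(1-s))} against |Math.cos(x)|. */
    static double[] maxErrors(int n, long seed) {
        Random rnd = new Random(seed);
        double errPlain = 0.0;
        double errFactored = 0.0;
        for (int i = 0; i < n; i++) {
            // random angle in [-4*pi, 4*pi)
            double x = (rnd.nextDouble() * 8.0 - 4.0) * Math.PI;
            double s = Math.sin(x);
            double ref = Math.abs(Math.cos(x));
            double plain = Math.sqrt(1.0 - s * s);
            double factored = Math.sqrt((1.0 + s) * (1.0 - s));
            errPlain = Math.max(errPlain, Math.abs(plain - ref));
            errFactored = Math.max(errFactored, Math.abs(factored - ref));
        }
        return new double[] {errPlain, errFactored};
    }

    public static void main(String[] args) {
        double[] e = maxErrors(1_000_000, 42L);
        System.out.println("max error, 1-s*s form:       " + e[0]);
        System.out.println("max error, (1+s)*(1-s) form: " + e[1]);
    }
}
```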
Sin 1,30 ==============
Cos 1,29 ==============
Sin, Cos 2,52 ============================
JNI sincos 1,77 ===================
SinSqrt 1,49 ================
CosSqrt 1,51 ================
The choice of sqrt(1-s*s) vs. sqrt((1+s)*(1-s)) makes about 0,01 difference. As you can see, the sqrt based approach wins hands down against any of the others (as we can't currently access sincos in pure Java). The JNI sincos is better than computing sin and cos, but the sqrt approach is still faster. cos itself seems to be consistently a tick (0,01) better than sin, but the case distinction to reconstruct the sign has an extra > test. I don't think my results support that either sin+sqrt or cos+sqrt is clearly preferable, but they do save around 40% of the time compared to sin then cos.
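The "around 40%" figure follows directly from the table: each variant's time is divided by the 2,52 measured for Sin, Cos. The small sketch below just redoes that arithmetic with the reported numbers (the class name Savings is hypothetical) and gives roughly 41% for SinSqrt, 40% for CosSqrt and 30% for JNI sincos.

```java
import java.util.Locale;

final class Savings {
    /** Fraction of time saved by a variant relative to the baseline. */
    static double saved(double variant, double baseline) {
        return 1.0 - variant / baseline;
    }

    public static void main(String[] args) {
        double sinThenCos = 2.52; // "Sin, Cos" row of the results table
        String[] names = {"JNI sincos", "SinSqrt", "CosSqrt"};
        double[] times = {1.77, 1.49, 1.51}; // matching rows of the results table
        for (int i = 0; i < names.length; i++) {
            double percent = 100.0 * saved(times[i], sinThenCos);
            System.out.println(String.format(Locale.ROOT, "%s saves %.1f%%", names[i], percent));
        }
    }
}
```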
If Java were extended with an intrinsic, optimized sincos, this would likely be even better. IMHO it is a common use case, e.g. in graphics. If it were used in AWT, Batik etc., numerous applications could benefit from it.
If I ran this again, I would also add a JNI sin and a noop to estimate the cost of JNI. Maybe also benchmark the sqrt trick via JNI. Just to make sure that we actually do want an intrinsic sincos in the long run.