Using "sincos" in Java


Question

In a lot of situations I not only need the sine, but also the cosine of the same parameter.

For C, there is the sincos function in the common Unix math library (libm, linked with -lm). And actually, at least on i386, this should be a single assembly instruction, fsincos.


sincos, sincosf, sincosl - calculate sin and cos simultaneously

I guess these benefits exist because there is an obvious overlap in computing sine and cosine: sin(x)^2 + cos(x)^2 = 1. But AFAIK it does not pay off to try to shortcut this as cos = Math.sqrt(1 - sin*sin), as the sqrt function comes at a similar cost.

Is there any way to reap the same benefits in Java? I guess I'm going to pay a price for a double[] then; which maybe makes all the efforts moot because of the added garbage collection.

Or is the HotSpot compiler smart enough to recognize that I need both, and compile this to a sincos instruction? Can I test whether it recognizes this, and can I help it do so, e.g. by making sure the Math.sin and Math.cos calls are directly successive in my code? This would actually make the most sense from a Java language point of view: having the compiler optimize this to use the fsincos assembly instruction.
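One way to check what the JIT actually emits is to isolate the two calls in a small hot method and dump the generated code. The following is only a sketch: the class and method names are made up for illustration, and the disassembly output requires the hsdis disassembler plugin to be installed for the JVM in use. What the output contains (an x87 instruction, a call into a runtime stub, or something else) depends on the JVM version and platform, so no particular result is assumed here.

```java
// Hypothetical probe class; the names are illustrative only.
final class SinCosProbe {

    /** Hot method: Math.sin and Math.cos of the same argument, back to back. */
    static double sinPlusCos(double x) {
        return Math.sin(x) + Math.cos(x);
    }

    public static void main(String[] args) {
        double acc = 0.0;
        // Enough iterations for the JIT to compile sinPlusCos.
        for (int i = 0; i < 5_000_000; i++) {
            acc += sinPlusCos(i * 1e-3);
        }
        // Print the accumulator so the loop is not eliminated as dead code.
        System.out.println(acc);
    }
}
```

Running it with `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly SinCosProbe`, or restricting the output with `-XX:CompileCommand=print,SinCosProbe::sinPlusCos`, prints the compiled code for inspection.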

Collected from some assembler documentation:

Instruction   8087      287       387       486       Pentium
fsin          -         -         122-771   257-354   16-126   NP
fsincos       -         -         194-809   292-365   17-137   NP
sqrt          180-186   180-186   122-129   83-87     70       NP
For fsin and fsincos, additional cycles are required if operand > pi/4 (~3.141/4 = ~.785).

fsincos should need an extra pop, but that should cost only 1 clock cycle. Assuming that the CPU does not optimize this either, sincos should be almost twice as fast as calling sin twice (the second time to compute the cosine, so I figure it will need an extra addition). sqrt could be faster in some situations, but sine can be faster.

Update: I've done some experiments in C, but they are inconclusive. Interestingly enough, sincos seems to be even slightly faster than sin (without cos), and the GCC compiler will use fsincos when you compute both sin and cos - so it does what I'd like Hotspot to do (or does Hotspot, too?). I could not yet prevent the compiler from outsmarting me by using fsincos except by not using cos. It will then fall back to a C sin, not fsin.

Recommended answer

I have performed some microbenchmarks with Caliper: 10000000 iterations over a (precomputed) array of random numbers in the range -4*pi .. 4*pi. I tried my best to get the fastest JNI solution I could come up with, but it's a bit hard to predict whether you will actually get fsincos or some emulated sincos. Reported numbers are the best of 10 Caliper trials (which in turn consist of 3-10 trials, the average of which is reported). So roughly it's 30-100 runs of the inner loop each.

I've benchmarked several variants:


  • Math.sin only (reference)
  • Math.cos only (reference)
  • Math.sin + Math.cos
  • sincos via JNI
  • Math.sin + cos via Math.sqrt( (1+sin) * (1-sin) ) + sign reconstruction
  • Math.cos + sin via Math.sqrt( (1+cos) * (1-cos) ) + sign reconstruction

(1+sin)*(1-sin)=1-sin*sin mathematically, but if sin is close to 1 it should be more precise? Runtime difference is minimal, you save one addition.
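The precision question can be illustrated with a small experiment. This is a sketch, not part of the benchmark: the class name and the test value s = 1 - 2^-30 are chosen here purely for illustration. For that value, s*s is rounded before the subtraction, so the direct form loses the small 2^-60 term, while both factors of the factored form are exactly representable.

```java
// Illustrative comparison of the two algebraically equal forms.
final class FactoredForm {

    /** 1 - s*s computed directly: s*s is rounded before the subtraction. */
    static double naive(double s) {
        return 1.0 - s * s;
    }

    /** The same quantity written as (1 + s) * (1 - s). */
    static double factored(double s) {
        return (1.0 + s) * (1.0 - s);
    }

    public static void main(String[] args) {
        // A sine value very close to 1 (assumed here for illustration).
        double s = 1.0 - Math.scalb(1.0, -30);
        System.out.println("1 - s*s         = " + naive(s));
        System.out.println("(1 + s)*(1 - s) = " + factored(s));
    }
}
```

Note that this only compares the two expressions for a given s. Any rounding error already present in s itself (from Math.sin) carries through both forms.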

Sign reconstruction via x %= TWOPI; if (x<0) x+=TWOPI; and then checking the quadrant. If you have an idea how to do this with less CPU, I'd be happy to hear.
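The sin + sqrt variant with sign reconstruction might look roughly like the following. This is a minimal sketch of the idea described above, not the benchmarked code: the class name, the method name and the choice to write both results into a caller-supplied double[] (to avoid allocating per call) are assumptions made for illustration.

```java
// Sketch of "Math.sin + cos via sqrt + sign reconstruction".
final class SinCosSqrt {
    private static final double TWO_PI = 2.0 * Math.PI;
    private static final double HALF_PI = 0.5 * Math.PI;
    private static final double THREE_HALF_PI = 1.5 * Math.PI;

    /** Writes sin(x) to out[0] and cos(x) to out[1]; out is reused by the caller. */
    static void sinCos(double x, double[] out) {
        double s = Math.sin(x);
        // |cos(x)| from sin^2 + cos^2 = 1, in the factored form.
        double c = Math.sqrt((1.0 + s) * (1.0 - s));
        // Reduce the angle to [0, 2*pi]; Java's % keeps the sign of x.
        double r = x % TWO_PI;
        if (r < 0) {
            r += TWO_PI;
        }
        // The cosine is negative in the second and third quadrants.
        if (r > HALF_PI && r < THREE_HALF_PI) {
            c = -c;
        }
        out[0] = s;
        out[1] = c;
    }
}
```

A caller would allocate `double[] sc = new double[2]` once and pass it to `SinCosSqrt.sinCos(x, sc)` inside the loop. Near odd multiples of pi/2, where the cosine is close to zero, the sqrt amplifies the rounding error of the sine, so the result is less accurate there than Math.cos.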

Numerical loss via sqrt seems to be okay, at least for common angles: on the order of 1e-10 in rough experiments.

Sin         1,30 ==============
Cos         1,29 ==============
Sin, Cos    2,52 ============================
JNI sincos  1,77 ===================
SinSqrt     1,49 ================
CosSqrt     1,51 ================

Using sqrt(1-s*s) vs. sqrt((1+s)*(1-s)) makes a difference of about 0,01. As you can see, the sqrt-based approach wins hands down against any of the others (as we can't currently access sincos in pure Java). The JNI sincos is better than computing sin and cos, but the sqrt approach is still faster. cos itself seems to be consistently a tick (0,01) better than sin, but the case distinction to reconstruct the sign has an extra > test. I don't think my results show that either sin+sqrt or cos+sqrt is clearly preferable, but they do save around 40% of the time compared to sin followed by cos.

If we extended Java to have an intrinsic, optimized sincos, this would likely be even better. IMHO it is a common use case, e.g. in graphics. If it were used in AWT, Batik etc., numerous applications could benefit from it.

If I'd run this again, I would also add JNI sin and a noop to estimate the cost of JNI. Maybe also benchmark the sqrt trick via JNI. Just to make sure that we actually do want an intrinsic sincos in the long run.
