How to get more performance out of automatic differentiation?

Question

I am having a hard time optimizing a program that relies on ad's conjugateGradientDescent function for most of its work.

Basically, my code is a translation of an old paper's code that is written in Matlab and C. I have not measured it, but that code runs at several iterations per second. Mine is on the order of minutes per iteration ...

The code is available in these repositories:

The code in question can be run by following these commands:

$ cd aer-utils
$ cabal sandbox init
$ cabal sandbox add-source ../aer
$ cabal run learngabors

Using GHC's profiling facilities, I have confirmed that the descent is in fact the part that takes most of the time:

(interactive version here: https://dl.dropboxusercontent.com/u/2359191/learngabors.svg)

The RTS option -s tells me that productivity is quite low:

Productivity  33.6% of total user, 33.6% of total elapsed

From what I have gathered, there are two things that might lead to higher performance:

  • Unboxing: currently I use a custom matrix implementation (in src/Data/SimpleMat.hs). This was the only way I could get ad to work with matrices (see: How to do automatic differentiation on hmatrix?). My guess is that a matrix type like newtype Mat w h a = Mat (Unboxed.Vector a) would achieve better performance due to unboxing and fusion. I found some code that has ad instances for unboxed vectors, but so far I haven't been able to use these with conjugateGradientDescent.

  • Matrix derivatives: In an email that I can't find at the moment, Edward mentions that it would be better to use Forward instances for matrix types instead of having matrices filled with Forward instances. I have a faint idea of how to achieve that, but have yet to figure out how I'd implement it in terms of ad's type classes.
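To make the second point concrete, here is a minimal sketch of the difference between a matrix filled with dual numbers and a single dual number whose components are whole matrices. It does not use ad at all: Dual, DualMat, mmul and mmulD are hypothetical names invented for this illustration, and matrices are plain lists of rows, which is not efficient. The only claim is that both formulations compute the same tangent, while the second one keeps each matrix as a flat block of Doubles.

```haskell
module Main where

import Data.List (transpose)

-- A matrix as a list of rows (illustration only, not efficient).
type Matrix = [[Double]]

-- Hypothetical scalar dual number: primal value and tangent.
data Dual = Dual Double Double deriving (Show, Eq)

instance Num Dual where
  Dual a a' + Dual b b' = Dual (a + b) (a' + b')
  Dual a a' * Dual b b' = Dual (a * b) (a * b' + a' * b)
  negate (Dual a a')    = Dual (negate a) (negate a')
  fromInteger n         = Dual (fromInteger n) 0
  abs    = error "abs: not needed for this sketch"
  signum = error "signum: not needed for this sketch"

-- Matrix product over any Num element type.
mmul :: Num a => [[a]] -> [[a]] -> [[a]]
mmul a b = [[sum (zipWith (*) row col) | col <- transpose b] | row <- a]

-- Element-wise matrix addition.
madd :: Matrix -> Matrix -> Matrix
madd = zipWith (zipWith (+))

-- Hypothetical "dual of matrices": one primal matrix, one tangent matrix.
data DualMat = DualMat Matrix Matrix deriving (Show, Eq)

-- Product rule at the matrix level: d(AB) = A dB + dA B.
mmulD :: DualMat -> DualMat -> DualMat
mmulD (DualMat a a') (DualMat b b') =
  DualMat (mmul a b) (madd (mmul a b') (mmul a' b))

-- Conversions between the two representations.
toDualMat :: [[Dual]] -> DualMat
toDualMat m = DualMat (map (map primal) m) (map (map tangent) m)
  where
    primal  (Dual x _)  = x
    tangent (Dual _ x') = x'

fromDualMat :: DualMat -> [[Dual]]
fromDualMat (DualMat a a') = zipWith (zipWith Dual) a a'

main :: IO ()
main = do
  let x = DualMat [[1, 2], [3, 4]] [[1, 0], [0, 0]]
      y = DualMat [[5, 6], [7, 8]] [[0, 0], [0, 0]]
      -- Matrix of duals: every element carries its own tangent.
      viaElements = toDualMat (mmul (fromDualMat x) (fromDualMat y))
      -- Dual of matrices: two ordinary matrix products plus an addition.
      viaMatrices = mmulD x y
  print viaMatrices
  print (viaElements == viaMatrices)
```

In the second formulation the primal and tangent parts are ordinary Double matrices, so in principle they could be backed by unboxed storage or handed to an optimized matrix library. Whether this can be expressed through ad's own classes is exactly the open question raised above.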

This is probably a question that is too broad to be answered on SO, so if you are willing to help me out here, feel free to contact me on Github.

Answer

You are running into pretty much the worst-case scenario for the current ad library here.

FWIW, you won't be able to use the existing ad classes/types with "matrix/vector ad". It'd be a fairly large engineering effort; see https://github.com/ekmett/ad/issues/2

As for why you can't unbox: conjugateGradientDescent requires the ability to use Kahn mode or two levels of forward mode on your functions. The former precludes it from working with unboxed vectors, as the data types carry syntax trees and can't be unboxed. For various technical reasons I haven't figured out how to make it work with a fixed-size 'tape' like the standard Reverse mode.
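The unboxing problem can be illustrated with a toy example. The Expr type below is a hypothetical stand-in written for this sketch, in the spirit of a tree-carrying reverse mode; it is not ad's actual Kahn definition, and the naive sweep ignores sharing, which a real implementation handles by sorting the graph topologically. The point is only that a value of such a type is a recursive, variable-sized structure, so it cannot be laid out as a fixed number of machine words in an unboxed vector, whereas a flat dual number can.

```haskell
module Main where

-- A flat forward-mode dual number: always exactly two Doubles,
-- so a vector of these could in principle be stored unboxed.
data Dual = Dual {-# UNPACK #-} !Double {-# UNPACK #-} !Double

-- Hypothetical tree-carrying number type (NOT ad's Kahn type).
-- Every arithmetic operation grows the tree, so values have no fixed size.
data Expr
  = Lit Double
  | Var Int
  | Add Expr Expr
  | Mul Expr Expr

instance Num Expr where
  (+)         = Add
  (*)         = Mul
  negate a    = Mul (Lit (-1)) a
  fromInteger = Lit . fromInteger
  abs    = error "abs: not needed for this sketch"
  signum = error "signum: not needed for this sketch"

-- Evaluate the recorded tree at a point (variables are indexed from 0).
eval :: [Double] -> Expr -> Double
eval _  (Lit c)   = c
eval xs (Var i)   = xs !! i
eval xs (Add a b) = eval xs a + eval xs b
eval xs (Mul a b) = eval xs a * eval xs b

-- Naive reverse sweep: push the adjoint down the tree and emit one
-- (variable index, contribution) pair per variable occurrence.
sweep :: [Double] -> Double -> Expr -> [(Int, Double)]
sweep _  _   (Lit _)   = []
sweep _  adj (Var i)   = [(i, adj)]
sweep xs adj (Add a b) = sweep xs adj a ++ sweep xs adj b
sweep xs adj (Mul a b) =
  sweep xs (adj * eval xs b) a ++ sweep xs (adj * eval xs a) b

-- Gradient of f at xs: record the tree, sweep it, sum per variable.
gradient :: ([Expr] -> Expr) -> [Double] -> [Double]
gradient f xs = [sum [c | (j, c) <- contribs, j == i] | i <- [0 .. n - 1]]
  where
    n        = length xs
    tree     = f (map Var [0 .. n - 1])
    contribs = sweep xs 1 tree

-- Example objective: f(x, y) = x*x*y + y.
objective :: Num a => [a] -> a
objective [x, y] = x * x * y + y
objective _      = error "objective expects exactly two arguments"

main :: IO ()
main = print (gradient objective [3, 2])
```

For the example objective the analytic gradient is (2xy, x² + 1), which is what the sweep accumulates. The Dual type is declared only for contrast: its strict, unpacked fields are the kind of layout that unboxed containers need, and a recursive type like Expr cannot provide it.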

I think the "right" answer here is for us to sit down and figure out how to get matrix/vector AD right and integrated into the package, but I confess I'm timesliced a bit too thinly right now to give it the attention it deserves.

If you get a chance to swing by #haskell-lens on irc.freenode.net, I'd be happy to talk about designs in this space and offer advice. Alex Lang has also been working on ad a lot, is often present there, and may have ideas.
