定义查找速度:性能问题 [英] Definition lookup speed: a performance issue

查看:28
本文介绍了定义查找速度:性能问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下问题.

我需要构建大量的定义(*),例如

I need to build a very large number of definitions(*) such as

f[{1,0,0,0}] = 1
f[{0,1,0,0}] = 2
f[{0,0,1,0}] = 3
f[{0,0,0,1}] = 2
...
f[{2,3,1,2}] = 4
...
f[{n1,n2,n3,n4}] = some integer
...

这只是一个例子.参数列表的长度不必是 4,而是可以是任何长度.我意识到当参数列表的长度增加时,每个值的查找速度会随着指数复杂性而减慢.也许这并不奇怪,因为很明显,原则上 Mathematica 需要存储多少定义存在组合爆炸.

This is just an example. The length of the argument list does not need to be 4 but can be anything. I realized that the lookup for each value slows down with exponential complexity when the length of the argument list increases. Perhaps this is not so strange, since it is clear that in principle there is a combinatorial explosion in how many definitions Mathematica needs to store.

不过,我希望 Mathematica 是智能的,并且值提取应该是恒定的时间复杂度.显然不是.

Though, I have expected Mathematica to be smart and that value extract should be constant time complexity. Apparently it is not.

有什么办法可以加快查找时间?这可能与 Mathematica 内部如何处理符号定义查找有关.它是否对列表进行短语直到找到匹配项?好像是这样.

Is there any way to speed up lookup time? This probably has to do with how Mathematica internally handles symbol definition lookups. Does it phrases the list until it finds the match? It seems that it does so.

高度赞赏所有建议.最诚挚的问候卓然

All suggestions highly appreciated. With best regards Zoran

(*) 我正在开发一个随机模拟软件,该软件生成系统的所有配置,并需要存储每个配置发生的次数.在这个意义上,列表 {n1, n2, ..., nT} 描述了系统的特定配置,表示有 n1 个类型 1 的粒子,n2 个类型 2 的粒子,...,nT 个类型 T 的粒子.可以是指数级的许多这样的配置.

(*) I am working on a stochastic simulation software that generates all configurations of a system and needs to store how many times each configuration occurred. In that sense a list {n1, n2, ..., nT} describes a particular configuration of the system saying that there are n1 particles of type 1, n2 particles of type 2, ..., nT particles of type T. There can be exponentially many such configurations.

推荐答案

您能否详细说明您是如何计算出查找时间是指数级的?

Could you give some detail on how you worked out that lookup time is exponential?

如果它确实是指数级的,也许您可​​以通过在键(配置)上使用 Hash 来加快速度,然后将键值对存储在一个列表中,例如 {{key1,value1},{key2,value2}},按 key 排序,然后使用二进制搜索(应该是日志时间).这应该非常快地编码,但在速度方面不是最佳的.

If it is indeed exponential, perhaps you could speed things up by using Hash on your keys (configurations), then storing key-value pairs in a list like {{key1,value1},{key2,value2}}, kept sorted by key and then using binary search (which should be log time). This should be very quick to code up but not optimum in terms of speed.

如果这还不够快,可以考虑设置一个合适的哈希表实现(我认为这是 f[{0,1,0,1}]=3 方法所做的,没有检查).

If that's not fast enough, one could think about setting up a proper hashtable implementation (which I thought was what the f[{0,1,0,1}]=3 approach did, without having checked).

但是一些放缓的玩具示例将有助于进一步推进......

But some toy example of the slowdown would be useful to proceed further...

我刚试过

test[length_] := Block[{f},
  Do[
   f[RandomInteger[{0, 10}, 100]] = RandomInteger[0, 10];,
   {i, 1, length}
   ];
  f[{0, 0, 0, 0, 1, 7, 0, 3, 7, 8, 0, 4, 5, 8, 0, 8, 6, 7, 7, 0, 1, 6,
      3, 9, 6, 9, 2, 7, 2, 8, 1, 1, 8, 4, 0, 5, 2, 9, 9, 10, 6, 3, 6, 
     8, 10, 0, 7, 1, 2, 8, 4, 4, 9, 5, 1, 10, 4, 1, 1, 3, 0, 3, 6, 5, 
     4, 0, 9, 5, 4, 6, 9, 6, 10, 6, 2, 4, 9, 2, 9, 8, 10, 0, 8, 4, 9, 
     5, 5, 9, 7, 2, 7, 4, 0, 2, 0, 10, 2, 4, 10, 1}] // timeIt
  ]

使用 timeIt 定义为即使是短时间运行也能准确计时,如下所示:

with timeIt defined to accurately time even short runs as follows:

timeIt::usage = "timeIt[expr] gives the time taken to execute expr,
  repeating as many times as necessary to achieve a total time of \
1s";

SetAttributes[timeIt, HoldAll]
timeIt[expr_] := Module[{t = Timing[expr;][[1]], tries = 1},
    While[t < 1.,
    tries *= 2;
    t = Timing[Do[expr, {tries}];][[1]];
    ];
    Return[t/tries]]

然后

out = {#, test[#]} & /@ {10, 100, 1000, 10000, 100000, 100000};
ListLogLogPlot@out

(也适用于较大的运行).所以这里的时间似乎是恒定的.

(also for larger runs). So it seems constant time here.

这篇关于定义查找速度:性能问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆