The best way to construct a function with memory

Question

Good day,

I have a very slow and complicated function, say f[x,y], and I need to construct a detailed ContourPlot of it. Moreover, the function f[x,y] sometimes fails due to lack of physical memory. In such cases I have to stop the evaluation and investigate the problem case at the point {x,y} by myself. Then I should be able to add the element {x,y,f[x,y]} to a list of computed values of f[x,y] (say, a "cache") and restart the evaluation of ContourPlot. ContourPlot must take all already computed values of f from the cache. I would prefer to store this list in a file so that I can reuse it later. It is also probably simpler to add problematic points to this file by hand.

What is the fastest way to implement this if the list of computed values of f may contain 10000-50000 points?

Recommended answer

Let's assume our slow function has the signature f[x, y].

Pure in-memory approach

If you are satisfied with an in-memory cache, the simplest thing to do would be to use memoization:

Clear@fmem
fmem[x_, y_] := fmem[x, y] = f[x, y]

This adds a definition to itself every time it is called with a combination of arguments that it has not seen before.
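As a rough illustration of that behaviour, the sketch below wraps a cheap stand-in for f that counts how often it actually runs. The stand-in body and the counter name calls are assumptions made for this example only; the real f is not shown in the question.

```mathematica
(* Hypothetical stand-in for the slow function: it increments a counter each time it runs *)
Clear[f, fmem, calls]
calls = 0;
f[x_, y_] := (calls++; Sin[x] Cos[y])

(* The memoized wrapper from the answer *)
fmem[x_, y_] := fmem[x, y] = f[x, y]

fmem[1., 2.];   (* first call evaluates f and stores fmem[1., 2.] as a new definition *)
fmem[1., 2.];   (* second call is answered by the stored definition; f is not run again *)

calls                    (* 1 *)
Length@DownValues[fmem]  (* 2: the stored value plus the general rule *)
```

The stored values live in the kernel's definitions for fmem, so they disappear when the kernel quits or crashes.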

File-backed in-memory approach

However, if you are running out of memory or suffering kernel crashes during the long computation, you will want to back this cache with some kind of persistence. The simplest thing would be to keep a running log file:

$runningLogFile = "/some/directory/runningLog.txt";

Clear@flog
flog[x_, y_] := flog[x, y] = f[x, y] /.
  v_ :> (PutAppend[Unevaluated[flog[x, y] = v;], $runningLogFile]; v)

If[FileExistsQ[$runningLogFile]
, Get[$runningLogFile]
, Export[$runningLogFile, "", "Text"];
]

flog is the same as fmem, except that it also writes an entry into the running log that can be used to restore the cached definition in a later session. The last expression reloads those definitions when it finds an existing log file (or creates the file if it does not exist).
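For the manual case described in the question (a point that had to be worked out by hand), one possible way to record it is to append an entry of the same shape from within Mathematica. This is only a sketch: the temporary log path, the point {0.3, 0.7} and the value 1.234 are all made up for illustration.

```mathematica
(* Hypothetical log location for this sketch; the answer uses "/some/directory/runningLog.txt" *)
$runningLogFile = FileNameJoin[{$TemporaryDirectory, "runningLogSketch.txt"}];

(* Start this sketch from a clean file *)
If[FileExistsQ[$runningLogFile], DeleteFile[$runningLogFile]];

(* Append a hand-computed value; Unevaluated keeps the assignment from running now *)
PutAppend[Unevaluated[flog[0.3, 0.7] = 1.234;], $runningLogFile]

(* Reading the log back replays every stored assignment *)
Clear@flog
Get[$runningLogFile];

flog[0.3, 0.7]   (* 1.234 *)
```

A stored entry is only used when the arguments match the stored point exactly, so a hand-added point has to be the same machine number that ContourPlot later requests.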

The textual nature of the log file is convenient when manual intervention is required. Be aware that the textual representation of floating-point numbers introduces unavoidable round-off errors, so you may get slightly different results after reloading the values from the log file. If this is of great concern, you might consider using the binary DumpSave feature although I will leave the details of that approach to the reader as it is not quite as convenient for keeping an incremental log.
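For readers who want to try the binary route, a minimal sketch might look like the following. It is not part of the original answer: the file name is hypothetical, and the cached value is entered directly so that the example does not depend on f.

```mathematica
(* Hypothetical cache contents, entered directly for the sake of the example *)
Clear@flog
flog[0.1, 0.2] = 0.30000000000000004;

(* Hypothetical file name; DumpSave writes all definitions of flog in binary form *)
$dumpFile = FileNameJoin[{$TemporaryDirectory, "flogCacheSketch.mx"}];
DumpSave[$dumpFile, flog];

(* Later, or in a new session: restore the definitions from the binary file *)
Clear@flog
Get[$dumpFile];

flog[0.1, 0.2] === 0.30000000000000004   (* True *)
```

DumpSave writes a snapshot of all current definitions rather than appending to an existing file, which is why it suits periodic checkpoints better than an incremental log. The resulting file is also meant to be read back on the same type of system that wrote it, and it cannot be edited by hand.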

SQL approach

If memory is really tight, and you want to avoid having a large in-memory cache to make room for the other computations, the previous strategy might not be appropriate. In that case, you might consider using Mathematica's built-in SQL database to store the cache completely externally:

fsql[x_, y_] :=
  loadCachedValue[x, y] /. $Failed :> saveCachedValue[x, y, f[x, y]]

I define loadCachedValue and saveCachedValue below. The basic idea is to create an SQL table where each row holds an x, y, f triple. The SQL table is queried every time a value is needed. Note that this approach is substantially slower than the in-memory cache, so it makes the most sense when the computation of f takes much longer than the SQL access time. The SQL approach does not suffer from the round-off errors that afflicted the text log file approach.

The definitions of loadCachedValue and saveCachedValue now follow, along with some other useful helper functions:

Needs["DatabaseLink`"]

$cacheFile = "/some/directory/cache.hsqldb";

openCacheConnection[] :=
  $cache = OpenSQLConnection[JDBC["HSQL(Standalone)", $cacheFile]]

closeCacheConnection[] :=
  CloseSQLConnection[$cache]

createCache[] :=
  SQLExecute[$cache,
    "CREATE TABLE cached_values (x float, y float, f float)
     ALTER TABLE cached_values ADD CONSTRAINT pk_cached_values PRIMARY KEY (x, y)"
  ]

saveCachedValue[x_, y_, value_] :=
  ( SQLExecute[$cache,
      "INSERT INTO cached_values (x, y, f) VALUES (?, ?, ?)", {x, y, value}
    ]
  ; value
  )

loadCachedValue[x_, y_] :=
  SQLExecute[$cache,
    "SELECT f FROM cached_values WHERE x = ? AND y = ?", {x, y}
  ] /. {{{v_}} :> v, {} :> $Failed}

replaceCachedValue[x_, y_, value_] :=
  SQLExecute[$cache,
    "UPDATE cached_values SET f = ? WHERE x = ? AND y = ?", {value, x, y}
  ]

clearCache[] :=
  SQLExecute[$cache,
    "DELETE FROM cached_values"
  ]

showCache[minX_, maxX_, minY_, maxY_] :=
  SQLExecute[$cache,
    "SELECT *
     FROM cached_values
     WHERE x BETWEEN ? AND ?
     AND y BETWEEN ? AND ?
     ORDER BY x, y"
  , {minX, maxX, minY, maxY}
  , "ShowColumnHeadings" -> True
  ] // TableForm

This SQL code uses floating point values as primary keys. This is normally a questionable practice in SQL but works fine in the present context.

You must call openCacheConnection[] before attempting to use any of these functions. You should call closeCacheConnection[] after you have finished. One time only, you must call createCache[] to initialize the SQL database. replaceCachedValue, clearCache and showCache are provided for manual interventions.
