How to make stats in constant memory

Problem description

I have a function that produces random numerical results. I know that each result is an integer in a small range [a, b] (b - a is about 50). I want to write a function that runs this function, say, 1,000,000 times and counts how often each result appears. (The function takes a random generator to produce its result.) The problem is that I don't know how to do this in constant memory without hard-coding the length of the range. My (bad) approach looks like this:

values :: [Int]
values = doFunctionNtimes myRandom 1000000
results = map (\x -> length . filter (x ==) $ values) [a..b]

Does anybody have an idea how to do this?

Edit:

I think I explained the problem badly, sorry about that. I have a function that, depending on a random generator, returns a small Int value. To gather statistics, I want to know how often each result appears. Since I want statistics over, say, 1,000,000 tries, I need memory use that stays constant in the number of tries.

Solution

import qualified Data.Map as Map
import Data.List (foldl')          -- ' (to fix SO syntax highlighting)

histogram :: (Ord a) => [a] -> Map.Map a Int
histogram = foldl' (\m x -> Map.insertWith' (+) x 1 m) Map.empty

The explanation for why this works and why it is superior to Travis Brown's solution is pretty technical, and will require some patience to understand fully.

If there are only finitely many values that can possibly occur in the list, then this runs in constant memory. Travis's solution has a subtle bug in which the resulting map entries will look like:

(4, 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1)

This is a very inefficient representation of the number 19. Only when you ask for that element in the map will the giant sum be computed. These "thunks" (delayed-evaluation expressions) will grow linearly with the size of the input.

To prevent this, we use insertWith', which applies the function strictly, that is to say it evaluates the result before it puts it in the map. So then if you insert 4 into the map above, it will evaluate the thunk and you will get a nice tidy:

(4, 20)

And another insertion will evaluate that before adding, so you will get:

(4, 21)

So now at least the values of the map are constant space.
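The difference between the lazy and the strict insert can be sketched side by side. This is only an illustration, not the code from either answer: the names histogramLazy and histogramStrict are made up here, and it assumes a version of the containers package that provides Data.Map.Lazy and Data.Map.Strict. To my knowledge, recent containers releases no longer export insertWith' from Data.Map, and Data.Map.Strict.insertWith plays the same role.

```haskell
import Data.List (foldl')
import qualified Data.Map.Lazy as ML
import qualified Data.Map.Strict as MS

-- Lazy insert (hypothetical name): the combined value is stored unevaluated,
-- so a count can pile up as a chain of pending additions (1 + (1 + ...)).
histogramLazy :: Ord a => [a] -> ML.Map a Int
histogramLazy = foldl' (\m x -> ML.insertWith (+) x 1 m) ML.empty

-- Strict insert (hypothetical name): the combined value is evaluated before
-- it is stored, so every count in the map is a plain Int.
histogramStrict :: Ord a => [a] -> MS.Map a Int
histogramStrict = foldl' (\m x -> MS.insertWith (+) x 1 m) MS.empty
```

Both functions return the same map; they differ only in when the additions are performed, which is what decides whether the values stay in constant space.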

The final thing we need to do is to change the right fold to a left fold, because inserting into a map is strict in the map argument (the second argument of the iw helper below). The following demonstrates the meaning of a right fold.

iw x m = Map.insertWith' (+) x 1 m    -- '

foldr iw Map.empty [1,2,1,3,2,1]
    = iw 1 (iw 2 (iw 1 (iw 3 (iw 2 (iw 1 Map.empty)))))

Here iw is just a shorthand. Because iw is strict in its second argument, the map you are inserting into must be evaluated before the insert can do any work. I will use the notation { k1 -> v1, k2 -> v2, ... } as a shorthand for maps. The sequence of evaluation looks like this:

foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)

foldr iw {} [1,2,1,3,2,1]
iw 1 (foldr iw {} [2,1,3,2,1])
iw 1 (iw 2 (foldr iw {} [1,3,2,1]))
iw 1 (iw 2 (iw 1 (foldr iw {} [3,2,1])))
iw 1 (iw 2 (iw 1 (iw 3 (foldr iw {} [2,1]))))
iw 1 (iw 2 (iw 1 (iw 3 (iw 2 (foldr iw {} [1])))))
iw 1 (iw 2 (iw 1 (iw 3 (iw 2 (iw 1 (foldr iw {} []))))))
iw 1 (iw 2 (iw 1 (iw 3 (iw 2 (iw 1 {})))))
iw 1 (iw 2 (iw 1 (iw 3 (iw 2 {1 -> 1}))))
iw 1 (iw 2 (iw 1 (iw 3 {1 -> 1, 2 -> 1})))
iw 1 (iw 2 (iw 1 {1 -> 1, 2 -> 1, 3 -> 1}))
iw 1 (iw 2 {1 -> 2, 2 -> 1, 3 -> 1})
iw 1 {1 -> 2, 2 -> 2, 3 -> 1}
{1 -> 3, 2 -> 2, 3 -> 1}

So if you have a 1,000,000-element list, the right fold has to go all the way down to the 1,000,000th element before it can start inserting, and so it has to keep the pending inserts for the previous 999,999 elements in memory to know what to do next. A left fold solves this:

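-- definition of left fold
foldl' f z xs = go z xs             -- '
    where
    -- when the list is exhausted, the accumulator is the result
    go accum [] = accum
    -- force the accumulator before recursing, so no thunk chain builds up
    go accum (x:xs) = accum `seq` go (f accum x) xs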

foldl' (flip iw) Map.empty [1,2,1,3,2,1]  -- needed to flip arg order to appease foldl'
go {} [1,2,1,3,2,1]
go (iw 1 {}) [2,1,3,2,1]
go (iw 2 {1 -> 1}) [1,3,2,1]
go (iw 1 {1 -> 1, 2 -> 1}) [3,2,1]
go (iw 3 {1 -> 2, 2 -> 1}) [2,1]
go (iw 2 {1 -> 2, 2 -> 1, 3 -> 1}) [1]
go (iw 1 {1 -> 2, 2 -> 2, 3 -> 1}) []
iw 1 {1 -> 2, 2 -> 2, 3 -> 1}
{1 -> 3, 2 -> 2, 3 -> 1}

Now we can see that, finally, if the number of entries in the map is bounded, then this runs in constant space and linear time.
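Putting the pieces together for the original question, a minimal end-to-end sketch might look like the following. It rests on several assumptions: nextSeed, toRange and samples are hypothetical stand-ins for the asker's random function (a small linear congruential generator, chosen only so the example needs no extra packages), and Data.Map.Strict.insertWith is used in place of insertWith'.

```haskell
import Data.List (foldl')
import Data.Word (Word32)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the random function: one step of a linear
-- congruential generator. Word32 arithmetic wraps around modulo 2^32.
nextSeed :: Word32 -> Word32
nextSeed s = 1664525 * s + 1013904223

-- Map a generator state to a value in the inclusive range [lo, hi],
-- using the higher bits of the state.
toRange :: Int -> Int -> Word32 -> Int
toRange lo hi s = lo + fromIntegral (s `div` 65536) `mod` (hi - lo + 1)

-- Lazily produce n samples in [lo, hi]; nothing is generated until consumed.
samples :: Int -> Int -> Int -> Word32 -> [Int]
samples lo hi n seed = map (toRange lo hi) (take n (tail (iterate nextSeed seed)))

-- Strict left fold with a strict insert, as in the answer above.
histogram :: Ord a => [a] -> Map.Map a Int
histogram = foldl' (\m x -> Map.insertWith (+) x 1 m) Map.empty
```

For example, evaluating Map.toList (histogram (samples 0 50 1000000 42)) counts one million draws. Because the list is produced lazily and consumed by foldl' as it goes, only the map (at most 51 entries here) needs to stay in memory, and the range never has to be hard-coded into the counting code.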
