Statistics with large amount of data in C++ or Scilab or Octave or R
Problem description
I recently needed to calculate the mean and standard deviation of a large number (about 800,000,000) of doubles. Considering that a double takes 8 bytes, if all the doubles are read into RAM, it will take about 6 GB. I think I could use a divide-and-conquer approach in C++ or another language, but that seems tedious. Is there a way I can do this all at once with a high-level language like R, Scilab or Octave? Thanks.
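For reference, the divide-and-conquer idea mentioned above is not much code: process the data in chunks and accumulate the count, the sum, and the sum of squares, then combine at the end. A minimal sketch in Python (the `chunks` iterable is a stand-in for however the data is actually read, e.g. from a binary file):

```python
import math

def mean_std(chunks):
    """Compute mean and population std from an iterable of chunks,
    accumulating count n, sum s and sum of squares ss."""
    n = 0
    s = 0.0
    ss = 0.0
    for chunk in chunks:
        n += len(chunk)
        s += sum(chunk)
        ss += sum(x * x for x in chunk)
    mean = s / n
    # population variance = E[x^2] - (E[x])^2
    return mean, math.sqrt(ss / n - mean * mean)

# toy example with two small chunks
m, sd = mean_std([[1.0, 2.0], [3.0, 4.0]])
# m = 2.5, sd = sqrt(1.25) ≈ 1.118
```

Only the running totals ever live in memory, so the per-chunk size can be tuned to whatever fits in RAM.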
Answer
Not claiming that this is optimal, but in Python (with the numpy and numexpr modules) the following is easy on an 8 GB RAM machine:
import numpy as np
import numexpr

x = np.random.uniform(0, 1, size=800_000_000)
std = (numexpr.evaluate('sum(x*x)') / len(x)
       - (numexpr.evaluate('sum(x)') / len(x)) ** 2) ** 0.5
print(x.mean(), std)
# 0.499991593345 0.288682001731
This doesn't consume more memory than the original array.
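One caveat worth noting: the `sum(x*x)/n - mean**2` formula above can lose precision when the variance is small relative to the mean. Welford's one-pass algorithm is numerically stable and works on a stream, so the data never needs to be in memory at all. A minimal sketch (the `stream` argument would be whatever iterator yields the doubles):

```python
def welford(stream):
    """One-pass Welford update: numerically stable mean and population std."""
    n = 0
    mean = 0.0
    m2 = 0.0  # sum of squared deviations from the running mean
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, (m2 / n) ** 0.5

m, sd = welford([1.0, 2.0, 3.0, 4.0])
# m = 2.5, sd ≈ 1.118
```

For 800 million values the pure-Python loop would be slow; in practice one would apply the same update chunk-wise with numpy, but the recurrence is the same.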