Perl: Programming efficiency when computing correlation coefficients for a large set of data

Views: 1052. Posted: 2017/4/2 13:15:08. Tags: perl, memory, performance, dataset

Problem description

Edit: The link should work now; sorry for the trouble.

I have a text file that looks like this:


Name, Test 1, Test 2, Test 3, Test 4, Test 5
Bob, 86, 83, 86, 80, 23
Alice, 38, 90, 100, 53, 32
Jill, 49, 53, 63, 43, 23

I am writing a program that, given this text file, generates a table of Pearson's correlation coefficients like the one below, where entry (x, y) is the correlation between person x and person y:


Name,Bob,Alice,Jill
Bob, 1, 0.567088412588577, 0.899798494392584
Alice, 0.567088412588577, 1, 0.812425393004088
Jill, 0.899798494392584, 0.812425393004088, 1
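
For reference, each entry in the table above can be computed as in the following minimal sketch. It is not the asker's linked code (which is not reproduced here); the `pearson` subroutine and its variable names are assumptions made for illustration. It takes two array references of equal length and applies the usual definition: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: Pearson's r for two equal-length lists (array refs).
sub pearson {
    my ($x, $y) = @_;
    my $n      = scalar @$x;
    my $mean_x = sum(@$x) / $n;
    my $mean_y = sum(@$y) / $n;

    my ($sxy, $sxx, $syy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my $dx = $x->[$i] - $mean_x;   # deviation of x from its mean
        my $dy = $y->[$i] - $mean_y;   # deviation of y from its mean
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }
    # Dies with a division-by-zero error if either list is constant.
    return $sxy / sqrt($sxx * $syy);
}

my @bob   = (86, 83, 86, 80, 23);
my @alice = (38, 90, 100, 53, 32);
printf "%.6f\n", pearson(\@bob, \@alice);
```

With the sample rows for Bob and Alice, the printed value should agree with the Bob/Alice entry in the table above.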

My program works, except that the data set I am feeding it has 82 columns and, more importantly, 54000 rows. When I run the program now, it is incredibly slow and I get an out-of-memory error. Is there a way to first remove any possibility of an out-of-memory error, and then perhaps make the program run more efficiently? The code is here: code.
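
One likely source of the memory problem, assuming the correlations are taken between rows as in the example, is the size of the result: 54000 rows give a 54000 x 54000 table, about 2.9 billion entries, which is far larger than the input itself. The sketch below is one possible way to avoid holding that table in memory, not a description of the linked code. It standardizes each row once (subtract the mean, scale to unit length) so that every correlation becomes a plain dot product, and it prints each output row as soon as it is computed. The names `standardize`, `dot`, and `write_correlations` are hypothetical, and the input is passed as a list of lines rather than read from a file to keep the example self-contained.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Centre a row and scale it to unit length; r(x, y) is then dot(x, y).
sub standardize {
    my @v    = @_;
    my $mean = sum(@v) / scalar @v;
    my @d    = map { $_ - $mean } @v;
    my $norm = sqrt(sum(map { $_ * $_ } @d));   # zero for a constant row
    return [ map { $_ / $norm } @d ];
}

sub dot {
    my ($x, $y) = @_;
    return sum(map { $x->[$_] * $y->[$_] } 0 .. $#$x);
}

# Hypothetical driver: $lines is an array ref of CSV lines (header first),
# $out is an open filehandle that receives the correlation table.
sub write_correlations {
    my ($lines, $out) = @_;
    my (@names, @rows);
    for my $line (@$lines[1 .. $#$lines]) {
        my ($name, @scores) = split /\s*,\s*/, $line;
        push @names, $name;
        push @rows,  standardize(@scores);
    }
    print {$out} join(',', 'Name', @names), "\n";
    for my $i (0 .. $#rows) {
        # Only one row of the result is held in memory at a time.
        my @r = map { dot($rows[$i], $rows[$_]) } 0 .. $#rows;
        print {$out} join(', ', $names[$i], map { sprintf '%.6f', $_ } @r), "\n";
    }
}

my @lines = (
    'Name, Test 1, Test 2, Test 3, Test 4, Test 5',
    'Bob, 86, 83, 86, 80, 23',
    'Alice, 38, 90, 100, 53, 32',
    'Jill, 49, 53, 63, 43, 23',
);
write_correlations(\@lines, \*STDOUT);
```

This only addresses memory. The number of pairwise dot products still grows with the square of the row count, so run time for 54000 rows remains substantial; since the table is symmetric, computing only the pairs with x <= y would roughly halve the work.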

Thanks for your help,
Jack

Edit: In case anyone else is trying to do large-scale computation, convert your data into HDF5 format. This is what I ended up doing to solve this issue.
