Applying pnorm to columns of a data frame

Question

I'm trying to normalize some data which I have in a data frame. I want to take each value and run it through the pnorm function along with the mean and standard deviation of the column the value lives in. Using loops, here's how I would write out what I want to do:

#example data
hist_data <- data.frame( matrix( rnorm( 200,mean=5,sd=.5 ),nrow=20 ) )

n <- dim( hist_data )[2] #columns=10
k <- dim( hist_data )[1] #rows   =20

#set up the data frame which we will populate with a loop
normalized <- data.frame( matrix( nrow = nrow( hist_data ), ncol = ncol( hist_data ) ) )

#hot loop in loop action
for ( i in 1:n ){
   for ( j in 1:k ){
      normalized[j,i] <- pnorm( hist_data[j,i], 
                                mean = mean( hist_data[,i] ), 
                                sd = sd( hist_data[,i] ) )
   }  
}
normalized

It seems that in R there should be a handy dandy vector way of doing this. I thought I was smart so tried using the apply function:

#trouble ahead
hist_data <- data.frame( matrix( rnorm( 200, mean = 5,sd = .5 ), nrow=10 ) )
normalized <- apply( hist_data, 2, pnorm, mean = mean( hist_data ), sd = sd( hist_data ) )
normalized

Much to my chagrin, that does NOT produce what I expected. The upper left and bottom right elements of the output are correct, but that's it. So how can I de-loopify my life?

Bonus points if you can tell me what my second code block is actually doing. Kind of a mystery to me still. :)

Recommended answer

You want:

normalize <- apply(hist_data, 2, function(x) pnorm(x, mean=mean(x), sd=sd(x)))

The problem is that you're passing in the individual column into pnorm, but the entire hist_data into both the mean & the sd.
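
As for what the second code block is actually doing: as far as I can tell, older versions of R let mean() and sd() take a whole data frame and return one value per column, so pnorm received each column together with the full vectors of column means and sds and recycled them down the rows. Current R versions no longer behave that way (mean() on a data frame returns NA with a warning and sd() fails), so the sketch below emulates the old behaviour with explicit vectors. The names col_means, col_sds, recycled and per_column are mine, and the seed is arbitrary.

```r
set.seed(42)  # arbitrary seed, only so the example is reproducible
hist_data <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))

# One mean and one sd per column (length 10 each). This stands in for what
# mean(hist_data) and sd(hist_data) are assumed to have returned in old R.
col_means <- sapply(hist_data, mean)
col_sds   <- sapply(hist_data, sd)

# Emulates the call from the question: every column is handed the *whole*
# vectors of means and sds, so row j is compared against column
# ((j - 1) %% 10) + 1's statistics instead of its own column's.
recycled <- apply(hist_data, 2, pnorm, mean = col_means, sd = col_sds)

# The intended result: each column uses its own mean and sd.
per_column <- apply(hist_data, 2, function(x) pnorm(x, mean = mean(x), sd = sd(x)))

# Positions where the two results agree
agree <- abs(recycled - per_column) < 1e-12
which(agree, arr.ind = TRUE)
```

The two results only agree where the recycled index happens to land on the cell's own column, which includes the upper left and bottom right cells. That matches the symptom described in the question.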

As I mentioned on twitter, I'm no stats guy so I can't answer anything about what you're actually trying to do :)
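
If you want to convince yourself that the anonymous-function version matches the double loop, here is a minimal check. It also shows one way to get a data frame back, since apply() returns a matrix. The lapply() variant and the names looped, via_apply, via_lapply and p_col are my additions, not part of the original answer.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible
hist_data <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))

# Helper: pnorm of a vector against its own mean and sd
p_col <- function(x) pnorm(x, mean = mean(x), sd = sd(x))

# Double loop from the question, used as the reference result
looped <- hist_data
for (i in seq_len(ncol(hist_data))) {
  for (j in seq_len(nrow(hist_data))) {
    looped[j, i] <- pnorm(hist_data[j, i],
                          mean = mean(hist_data[, i]),
                          sd = sd(hist_data[, i]))
  }
}

# apply() over columns: returns a numeric matrix
via_apply <- apply(hist_data, 2, p_col)

# lapply() assigned into hist_data's shape: stays a data frame
via_lapply <- hist_data
via_lapply[] <- lapply(hist_data, p_col)

# Both should report TRUE
all.equal(as.matrix(looped), via_apply, check.attributes = FALSE)
all.equal(looped, via_lapply)
```

Use the apply() form when a matrix is fine, and the lapply() form when later code expects a data frame with the original column names.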
