Understanding `scale` in R

Question

I'm trying to understand the definition of scale that R provides. I have data (mydata) that I want to make a heat map with, and there is a VERY strong positive skew. I've created a heatmap with a dendrogram for both scale(mydata) and log(mydata), and the two dendrograms are different. Why? What does it mean to scale my data, versus log transform my data? And which would be more appropriate if I want to look at the dendrogram illustrating the relationship between the columns of my data?

Thank you for any help! I've read the definitions but they are whooshing over my head.

Recommended answer

log simply takes the logarithm (base e, by default) of each element of the vector.
scale, with default settings, calculates the mean and standard deviation of the vector, then "scales" each element by subtracting the mean and dividing by the sd. If x is a matrix, this is done separately for each column. (If you use scale(x, scale=FALSE), it will only subtract the mean and not divide by the standard deviation.)
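One consequence of these definitions, relevant to skewed data: scale is a linear transformation, so it shifts and rescales the values without changing the shape of their distribution, while log changes the shape. The sketch below illustrates this on a made-up right-skewed sample; the `skew` helper is defined here for illustration and is not part of base R.

```r
set.seed(1)
x <- rlnorm(1000)   # hypothetical, strongly right-skewed sample

# Simple moment-based skewness (illustrative helper, not a base R function)
skew <- function(v) mean((v - mean(v))^3) / mean((v - mean(v))^2)^1.5

skew(x)                     # clearly positive
skew(as.vector(scale(x)))   # same as skew(x): scaling keeps the shape
skew(log(x))                # much closer to 0: log pulls in the long tail
```

Because the two transformations change the values in different ways, distances between columns, and therefore the dendrograms built from them, can differ.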

Note that the two approaches below give the same values (scale(x) returns them as a one-column matrix, with the center and scale stored as attributes):

   set.seed(1)
   x <- runif(7)

   # Manually scaling
   (x - mean(x)) / sd(x)

   scale(x)
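For a matrix, which is the usual input to a heat map, the same calculation is applied to each column separately. A minimal sketch, assuming a small made-up matrix `m` (its values are purely illustrative):

```r
# Hypothetical 4 x 2 matrix; the columns have very different ranges
m <- matrix(c(1, 2, 3, 4,
              10, 100, 1000, 10000), ncol = 2)

s <- scale(m)   # centers and scales each column on its own

# Manual equivalent: subtract column means, then divide by column sds
centered <- sweep(m, 2, colMeans(m), "-")
manual   <- sweep(centered, 2, apply(m, 2, sd), "/")

all.equal(s, manual, check.attributes = FALSE)  # compares values only
colMeans(s)       # each column mean is (numerically) zero
apply(s, 2, sd)   # each column sd is one
```

The `check.attributes = FALSE` argument is needed because scale() attaches "scaled:center" and "scaled:scale" attributes to its result, which the manual version lacks.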
