Scale all values depending on group


Problem description

I have a data frame similar to this one:

ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543,3524, 353, 3432, 4542, 6343, 4534 )
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

Now I would like to scale the values in p1 and p2 depending on their ID. So the whole column should not be scaled at once, like when using the tapply() function; instead, scaling should be done once for all values with ID 1, then for all values with ID 2, and so on. The same applies to p2. The new data frame should consist of the scaled values.

I have tried

df_scaled <- ddply(my.df, my.df$ID, scale(my.df$p1))

but I get the error message

.fun is not a function.

Thanks for your help!

Recommended answer

dplyr makes this easy:

ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543,3524, 353, 3432, 4542, 6343, 4534 )
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

library(dplyr)
df_scaled <- my.df %>%
  group_by(ID) %>%
  mutate(p1 = scale(p1), p2 = scale(p2))
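
One thing to keep in mind is that scale() returns a one-column matrix (with centering and scaling attributes) rather than a plain vector, which can make the resulting columns awkward to work with. The following is a minimal sketch, not part of the original answer, that does the same per-group scaling in base R with ave() and flattens the result with as.numeric(). The helper name scale_vec is hypothetical and introduced only for illustration.

```r
# Same example data as in the question.
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543, 3524, 353, 3432, 4542, 6343, 4534)
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

# Hypothetical helper: scale() returns a one-column matrix,
# so as.numeric() flattens it into a plain numeric vector.
scale_vec <- function(x) as.numeric(scale(x))

# ave() applies the function within each ID group and keeps the original row order.
df_scaled <- my.df
df_scaled$p1 <- ave(my.df$p1, my.df$ID, FUN = scale_vec)
df_scaled$p2 <- ave(my.df$p2, my.df$ID, FUN = scale_vec)

# Sanity check: the standard deviation within each ID group.
aggregate(cbind(p1, p2) ~ ID, data = df_scaled, FUN = sd)
```

After scaling, each ID group should have a mean of approximately 0 and a standard deviation of 1 in both p1 and p2, which the final aggregate() call lets you inspect.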

Note that there is a bug in the stable version of dplyr when working with scale; you might need to update to the dev version (see comments).
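
As for the error in the question: ddply() expects the grouping variables as its second argument (for example the column name "ID", not the vector my.df$ID) and a function as its third. In the original call, scale(my.df$p1) is evaluated immediately and hands ddply() a matrix instead of a function, which appears to be what triggers ".fun is not a function." A sketch of a corrected call, assuming the plyr package is installed, could look like this:

```r
# Same example data as in the question.
my.df <- data.frame(
  ID = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  p1 = c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543, 3524, 353, 3432, 4542, 6343, 4534),
  p2 = c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
)

# .variables names the grouping column; .fun must be a function (here base R's transform).
# The remaining arguments are passed on to transform() and evaluated within each group.
# as.numeric() flattens the one-column matrix that scale() returns.
df_scaled <- plyr::ddply(
  my.df, "ID", transform,
  p1 = as.numeric(scale(p1)),
  p2 = as.numeric(scale(p2))
)
```

Calling plyr::ddply() with the namespace prefix avoids attaching plyr alongside dplyr, since the two packages share several function names.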
