Regression on subset of data set
Problem description
I'd like to do the following and need some help:
Fit a regression (Height against Age) separately for
(A) each individual
(B) each gender
and create a table containing the results (slope and intercept). Can I use "apply" for this?
As a next step, I would like to run a statistical test to determine whether the slope and intercept differ significantly between genders. I know how to do the test in R, but maybe there is a way to combine the slope/intercept calculation with the t-test.
Example data:
example = data.frame(Age = c(1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12),
                     Individual = c("Jack", "Jack", "Jack", "Jack", "Jack",
                                    "Jill", "Jill", "Jill", "Jill", "Jill",
                                    "Tony", "Tony", "Tony", "Tony", "Tony",
                                    "Jen", "Jen", "Jen", "Jen", "Jen"),
                     Gender = c("M", "M", "M", "M", "M",
                                "F", "F", "F", "F", "F",
                                "M", "M", "M", "M", "M",
                                "F", "F", "F", "F", "F"),
                     Height = c(38, 62, 92, 119, 165,
                                31, 59, 87, 118, 170,
                                45, 72, 93, 155, 171,
                                33, 61, 92, 115, 168))
Recommended answer
One way to run the regression separately for each level and then combine the slopes and intercepts in a data frame is to use the function ddply() from the plyr package.
library(plyr)
ddply(example, "Individual", function(x) coefficients(lm(Height ~ Age, x)))
#   Individual (Intercept)      Age
# 1       Jack    26.29188 11.11421
# 2        Jen    22.10660 11.56345
# 3       Jill    18.33249 12.04315
# 4       Tony    33.02030 11.96447
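Since the question asks about "apply": the same kind of table can probably also be built in base R with split() and sapply(), without plyr. The following is only a sketch under that assumption. fit_by is a hypothetical helper name introduced here for illustration, and the example data is rebuilt with rep() so that the block runs on its own.

```r
# Example data from the question, rebuilt compactly so this sketch is self-contained.
example <- data.frame(
  Age        = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender     = rep(c("M", "F", "M", "F"), each = 5),
  Height     = c(38, 62, 92, 119, 165,
                 31, 59, 87, 118, 170,
                 45, 72, 93, 155, 171,
                 33, 61, 92, 115, 168)
)

# Hypothetical helper (not part of base R or plyr):
# fits Height ~ Age once per level of `group` and returns one row per level.
fit_by <- function(data, group) {
  pieces <- split(data, data[[group]])
  # sapply() returns a 2 x k matrix: rows are intercept and slope, columns are levels.
  coefs <- sapply(pieces, function(d) coef(lm(Height ~ Age, data = d)))
  out <- data.frame(names(pieces), t(coefs), row.names = NULL)
  names(out) <- c(group, "Intercept", "Slope")
  out
}

by_individual <- fit_by(example, "Individual")
by_gender     <- fit_by(example, "Gender")
print(by_individual)
print(by_gender)
```

split() orders the groups alphabetically, so the rows should come out in the same order as in the ddply() result.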
ddply(example, "Gender", function(x) coefficients(lm(Height ~ Age, x)))
#   Gender (Intercept)      Age
# 1      F    20.21954 11.80330
# 2      M    29.65609 11.53934
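For the second part of the question (testing whether slope and intercept differ between genders), one common approach is to fit a single model with an Age by Gender interaction instead of comparing two separate fits. The sketch below is not part of the original answer. It assumes a common residual variance across genders and treats all rows as independent observations, which repeated measurements on the same individual may not satisfy.

```r
# Example data from the question, rebuilt compactly so this sketch is self-contained.
example <- data.frame(
  Age        = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender     = rep(c("M", "F", "M", "F"), each = 5),
  Height     = c(38, 62, 92, 119, 165,
                 31, 59, 87, 118, 170,
                 45, 72, 93, 155, 171,
                 33, 61, 92, 115, 168)
)

# Age * Gender expands to Age + Gender + Age:Gender.
fit <- lm(Height ~ Age * Gender, data = example)
coefs <- summary(fit)$coefficients

# "F" is the reference level (levels are sorted alphabetically), so:
#   GenderM     = intercept of M minus intercept of F
#   Age:GenderM = slope of M minus slope of F
# Each row carries the estimate, standard error, t value and p-value.
print(coefs[c("GenderM", "Age:GenderM"), ])
```

The two estimates equal the differences between the per-gender intercepts and slopes shown above, and their t-tests address whether those differences are distinguishable from zero under the stated assumptions.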