Regression on subset of data set


Problem description

I'd like to do the following and need some help:

Run a linear regression (Height against Age) separately for

(A) each individual

(B) each gender

and create a table containing the results (slope and intercept). Can I use "apply" for this?

As a next step, I would like to run a statistical test to determine whether the slope and intercept differ significantly between genders. I know how to do the test in R, but maybe there is a way to combine the slope/intercept calculation with the t-test.

Example data:

example = data.frame(Age = c(1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12), 
                Individual = c("Jack", "Jack", "Jack", "Jack", "Jack",
                               "Jill", "Jill", "Jill", "Jill", "Jill",
                               "Tony", "Tony", "Tony", "Tony", "Tony",
                               "Jen", "Jen", "Jen", "Jen","Jen"),
                    Gender = c("M", "M", "M", "M", "M",
                               "F", "F", "F", "F", "F",
                               "M", "M", "M", "M", "M",
                               "F", "F", "F", "F", "F"),
                    Height = c(38, 62, 92, 119, 165,
                               31, 59, 87, 118, 170,
                               45, 72, 93, 155, 171,
                               33, 61, 92, 115, 168))

Recommended answer

One way to run the regression separately for each level, and then combine the slopes and intercepts in a data frame, is to use the function ddply() from the plyr package.

library(plyr)

ddply(example,"Individual",function(x) coefficients(lm(Height~Age,x)))
  Individual (Intercept)      Age
1       Jack    26.29188 11.11421
2        Jen    22.10660 11.56345
3       Jill    18.33249 12.04315
4       Tony    33.02030 11.96447

ddply(example,"Gender",function(x) coefficients(lm(Height~Age,x)))
  Gender (Intercept)      Age
1      F    20.21954 11.80330
2      M    29.65609 11.53934
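
The second part of the question (testing whether slope and intercept differ between genders) is not covered above. One possible approach, offered as a sketch rather than as the original answerer's method, is to fit a single model with an Age-by-Gender interaction using base R's lm(). The data frame is rebuilt here so the block runs on its own; the names fit, coef_table and gender_tests are illustrative. Note the simplifying assumption that all 20 rows are treated as independent observations, even though each individual is measured five times.

```r
# Example data copied from the question (same values, written compactly).
example <- data.frame(
  Age        = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender     = rep(c("M", "F", "M", "F"), each = 5),
  Height     = c(38, 62, 92, 119, 165,
                 31, 59, 87, 118, 170,
                 45, 72, 93, 155, 171,
                 33, 61, 92, 115, 168)
)

# One model with a Gender main effect and an Age:Gender interaction.
# "F" is the reference level (alphabetical default), so:
#   (Intercept)  = intercept for F
#   Age          = slope for F
#   GenderM      = intercept for M minus intercept for F
#   Age:GenderM  = slope for M minus slope for F
fit <- lm(Height ~ Age * Gender, data = example)

# Coefficient table: estimate, standard error, t value and p value per term.
coef_table <- coef(summary(fit))
print(coef_table)

# The two rows that carry the between-gender t-tests.
gender_tests <- coef_table[c("GenderM", "Age:GenderM"), ]
print(gender_tests)
```

The GenderM row tests the difference in intercepts and the Age:GenderM row tests the difference in slopes. Their estimates equal the differences between the per-gender coefficients in the ddply() output above, so the calculation and the t-tests come from the same fit.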
