Generating predicted values for levels of a factor variable


Question

I am regressing a continuous outcome variable on a number of factor variables using lm(). For example,

fit <- lm(dv ~ factor(hour) + factor(weekday) + factor(month) + factor(year) + count, data = df)

I would like to generate predicted values (yhat) for each level of one factor variable while holding the other variables at their median or modal values. For example, how would I generate yhat for each weekday while holding the other predictors constant?

Recommended answer

I may be able to help, building on @Roland's comments. I think what you want is plain old ANOVA, which helps determine whether a factor is important. In this example there is no need to call factor(); integer or numeric columns (class: numeric) work fine, although lm() then treats each one as a continuous predictor with a single slope. I put together the following code as an example:

# Create a small example data frame (d is 2, 4, 6 repeated twice)
(df <- data.frame(h = c(1, 3, 4, 0, 2, 3), d = rep(2 * 1:3, 2), m = c(-1, 0, 3, 4, 7, 8), y = c(30, 28, 27, 26, 22, 21)))

# Fit the linear model and print it; data = df keeps the term names clean
(fit <- lm(d ~ h + m + y, data = df))

# Run ANOVA on the linear model
anova(fit)

# Fitted values for the rows used in the fit (pass newdata to predict at other values of h)
predict(fit)

ANOVA is a special case of regression. The P values in the output tell you whether each term is significant.

> anova(fit)
Analysis of Variance Table

Response: d
          Df  Sum Sq Mean Sq F value  Pr(>F)  
h          1 13.2923 13.2923 89.5846 0.01098 *
m          1  2.2832  2.2832 15.3879 0.05927 .
y          1  0.1277  0.1277  0.8608 0.45147  
Residuals  2  0.2968  0.1484     

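In this example h (hours) accounts for most of the variation in the dependent variable d (days) and is significant at the 5% level (p ≈ 0.011). m (months) is the next strongest term but only marginally significant (p ≈ 0.059), and y is not significant. Note that anova() on an lm fit reports sequential sums of squares, so these results depend on the order of the terms in the formula.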
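To get the predicted values the question asks for, one option is to pass predict() a newdata frame with one row per level of the factor of interest and the other predictors fixed at their median (numeric) or modal (factor) values. The following is only a sketch under stated assumptions: the data are simulated because the original df was not posted, modal_level is a helper name made up for this example, and month and year are left out to keep it short.

```r
# Simulated stand-in for the asker's df; every value here is made up.
set.seed(1)
n <- 500
df <- data.frame(
  hour    = factor(sample(0:23, n, replace = TRUE)),
  weekday = factor(sample(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
                          n, replace = TRUE)),
  count   = rpois(n, lambda = 10)
)
df$dv <- 5 + 0.3 * df$count + 2 * (df$weekday == "Sat") + rnorm(n)

# Store the predictors as factors in df, so newdata can reuse the same levels.
fit <- lm(dv ~ hour + weekday + count, data = df)

# Hypothetical helper: the most frequent level of a factor, as a string.
modal_level <- function(x) names(which.max(table(x)))

# One row per weekday; hour held at its mode, count at its median.
newdata <- data.frame(
  weekday = factor(levels(df$weekday), levels = levels(df$weekday)),
  hour    = factor(modal_level(df$hour), levels = levels(df$hour)),
  count   = median(df$count)
)
newdata$yhat <- predict(fit, newdata = newdata)
newdata
```

This relies on the model being fitted with data = df and plain column names; with terms written as df$h in the formula, predict() cannot match the columns of newdata. Because the other predictors are held fixed, the differences between the yhat values equal the differences between the corresponding weekday coefficients.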

See this link for background:

http://www.cookbook-r.com/Statistical_analysis/ANOVA/

FYI - I recommend including some source code that creates your example data. That way everyone who tries to answer your question can refer to the same example.

FYI2 - I recommend adding the tag "regression".

HTH.
