Extract Formula from lm including Categorical Variables (R)


Question

I have an lm object and want to extract its formula with the fitted coefficients filled in. The model includes categorical variables such as month, as well as interactions between these categorical variables and numeric ones.

Another user helped with some code that works for everything except categorical variables. When I add a categorical variable (e.g. d here), it breaks down with the error "Error in parse(text = x) : <text>:1:785: unexpected numeric constant":

a = c(1, 2, 5, 13, 40, 29, 82, 22, 34, 54, 12, 31, 21, 29, 31, 42)
b = c(12, 15, 20, 12, 34, 56, 12, 12, 15, 20, 12, 34, 56, 12, 32, 41)
c = c(20, 30, 40, 18, 72, 34, 12, 40, 18, 72, 28, 65, 21, 32, 42, 52)
d = structure(c(8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 
8L, 1L, 9L, 7L), .Label = c("April", "August", "December", 
"February", "January", "July", "June", "March", "May", "November", 
"October", "September"), class = "factor")


model = lm(a~b+c+factor(d))

as.formula(
  paste0("y ~ ", round(coefficients(model)[1],2), " + ", 
    paste(sprintf("%.2f * %s", 
                  coefficients(model)[-1],  
                  names(coefficients(model)[-1])), 
          collapse=" + ")
  )
)

What I get from the above is:

Error in parse(text = x) : <text>:1:53: unexpected symbol
1: y ~ -7 + 14.23 * b + -6.82 * c + -529.30 * factor(d)August

What I'd like is to get the full formula, with each month multiplied by its coefficient (in this example only 3 of them; in my actual dataset I have much more data, and every month occurs at least 8 times). Instead it stalls here: in this example with "unexpected symbol", and in my actual data with "Error in parse(text = x) : <text>:1:785: unexpected numeric constant", without even getting as far as a month term as it does here (I'm not sure why the example and the actual code behave differently).

My formulas are quite large, so the solution needs to scale up (which the current code does).

Recommended answer

What you are creating is not a valid formula in R, so don't try to coerce the result of sprintf into a formula.
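A quick way to see why the string is not valid R is to hand similar text to parse() directly. The sketch below uses made-up coefficients, and try_parse() is a hypothetical helper written only for this illustration. The second string assumes a factor level that starts with a digit, which is one plausible (unconfirmed) source of the "unexpected numeric constant" error in the real data.

```r
# Coefficient names for factor levels are labels, not R expressions.
bad_symbol  <- "y ~ 1 + 2 * b + 3 * factor(d)August"  # call followed by a bare name
bad_numeric <- "y ~ 1 + 2 * b + 3 * factor(m)2"       # assumed level name "2"

# try_parse() is a hypothetical helper: it returns the parser's error
# message, or "ok" when the text parses.
try_parse <- function(txt) {
  tryCatch({
    parse(text = txt)
    "ok"
  }, error = function(e) conditionMessage(e))
}

print(try_parse(bad_symbol))
print(try_parse(bad_numeric))

# Backticks turn the whole label into a single symbol, so this text parses.
print(try_parse("y ~ 1 + 2 * b + 3 * `factor(d)August`"))
```

Backtick quoting only makes the text parseable; the result would still not be a meaningful model formula, which is why keeping the equation as a plain string is the simpler route.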

So something like:

sprintf(' y ~ %.2f + %s', coef(model)[1], 
   paste(sprintf('(%.2f) * %s',
          coef(model)[-1], names(coef(model)[-1]) ), collapse ='+'))
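As a minimal end-to-end sketch of the same idea, the code below refits the model from the question and keeps the equation as a character string. The variable names cf, eq, X, and manual are introduced here for illustration, and d is rebuilt from month.name rather than structure(), which should give the same levels and codes. The last step is an addition, not part of the original answer: it checks that each label such as factor(d)August corresponds to a 0/1 column of the design matrix.

```r
# Data from the question; d is rebuilt from month names (same levels and order).
a <- c(1, 2, 5, 13, 40, 29, 82, 22, 34, 54, 12, 31, 21, 29, 31, 42)
b <- c(12, 15, 20, 12, 34, 56, 12, 12, 15, 20, 12, 34, 56, 12, 32, 41)
c <- c(20, 30, 40, 18, 72, 34, 12, 40, 18, 72, 28, 65, 21, 32, 42, 52)
d <- factor(month.name[c(3:12, 1:2, 3:6)])

model <- lm(a ~ b + c + factor(d))
cf <- coef(model)

# Keep the equation as a character string; do not pass it to as.formula().
eq <- sprintf("y ~ %.2f + %s", cf[1],
              paste(sprintf("(%.2f) * %s", cf[-1], names(cf)[-1]),
                    collapse = " + "))
cat(eq, "\n")

# Each name such as "factor(d)August" labels a 0/1 column of the design
# matrix, so the fitted values are the design matrix times the coefficients.
X <- model.matrix(model)
manual <- drop(X %*% cf)
print(all.equal(unname(manual), unname(fitted(model))))
```

Because the string is built from names(coef(model)), the same code handles interaction terms and larger models without changes.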
