How to run a linear regression model for each industry-year, excluding firm i's observations, in R?

Question

This is the dput output of my dataset in R:

data1<-structure(list(Year = c(1998, 1999, 1999, 2000, 1996, 2001, 1998, 
1999, 2002, 1998, 2005, 1998, 1999, 1998, 1997, 1998, 2000), 
    `Firm name` = c("A", "A", "B", "B", "C", "C", "D", "D", "D", 
    "E", "E", "F", "F", "G", "G", "H", "H"), Industry = c("AUTO", 
    "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", 
    "Pharma", "Pharma", "Pharma", "Pharma", "Pharma", "Pharma", 
    "Pharma", "Pharma"), X = c(1, 2, 5, 6, 7, 9, 10, 11, 12, 
    13, 15, 16, 17, 18, 19, 20, 21), Y = c(30, 31, 34, 35, 36, 
    38, 39, 40, 41, 42, 44, 45, 46, 47, 48, 49, 50), Z = c(23, 
    29, 47, 53, 59, 71, 77, 83, 89, 95, 107, 113, 119, 125, 131, 
    137, 143)), row.names = c(NA, -17L), class = c("tbl_df", 
"tbl", "data.frame"), na.action = structure(c(`1` = 1L), class = "omit"))

Here I am trying to regress Y ~ X + Z for each industry-year, excluding firm i's observations. For each firm, I want to estimate the linear regression model using all observations of its industry peers but none of its own. For example, for firm A, I want to regress Y ~ X + Z using all observations of its industry peers (B, C and D) across time, excluding firm A's observations. Similarly, for firm B I want to use all observations of firms A, C and D (which are in the same industry as B) across time, excluding firm B's observations. The same procedure applies to firms C and D. I want to do this for every firm within each industry. Please help.

Recommended answer

As mentioned by @bonedi, you can use a nested loop to accomplish this. If you want to create models for individual industry-year combinations, you will need to subset your data by Industry and Year. Within each subset, you can loop over Firm name and exclude that firm before fitting the model. The results can be stored in a list whose elements are named industry-year-firm. It's not a pretty solution, but it should get you closer.

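# Fitted models, keyed by "industry-year-firm"
lst <- list()

for (ind in unique(data1$Industry)) {
  # Years observed within this industry
  for (year in unique(data1[data1$Industry == ind, ]$Year)) {
    # Firms observed in this industry-year
    for (firm in unique(data1[data1$Industry == ind & data1$Year == year, ]$`Firm name`)) {
      # Same industry and year, with the current firm's rows dropped
      sub_data <- data1[data1$Industry == ind & data1$Year == year & data1$`Firm name` != firm, ]
      # Skip the fit when the firm has no peers in this industry-year
      if (nrow(sub_data) > 0) {
        name <- paste(ind, year, firm, sep = '-')
        lst[[name]] <- lm(Y ~ X + Z, data = sub_data)
      }
    }
  }
}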
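The question also describes a variant that pools each industry's peer observations across all years rather than fitting one model per year. A minimal sketch of that reading follows; it is not part of the original answer. The names `firms`, `peers`, `peer_models` and `coef_table` are hypothetical, and the data frame is rebuilt from the same values as the dput output so that the block runs on its own.

```r
# Same values as the dput output in the question, as a plain data frame.
data1 <- data.frame(
  Year = c(1998, 1999, 1999, 2000, 1996, 2001, 1998, 1999, 2002,
           1998, 2005, 1998, 1999, 1998, 1997, 1998, 2000),
  `Firm name` = c("A", "A", "B", "B", "C", "C", "D", "D", "D",
                  "E", "E", "F", "F", "G", "G", "H", "H"),
  Industry = rep(c("AUTO", "Pharma"), times = c(9, 8)),
  X = c(1, 2, 5, 6, 7, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21),
  Y = c(30, 31, 34, 35, 36, 38, 39, 40, 41, 42, 44, 45, 46, 47, 48, 49, 50),
  Z = c(23, 29, 47, 53, 59, 71, 77, 83, 89, 95, 107, 113, 119, 125, 131, 137, 143),
  check.names = FALSE, stringsAsFactors = FALSE
)

# One row per (industry, firm) pair; these are the firms to leave out in turn.
firms <- unique(data1[, c("Industry", "Firm name")])

# Hypothetical container: one model per firm, keyed by "industry-firm".
peer_models <- list()

for (i in seq_len(nrow(firms))) {
  ind  <- firms$Industry[i]
  firm <- firms$`Firm name`[i]
  # All rows from the same industry, in any year, except the firm's own rows
  peers <- data1[data1$Industry == ind & data1$`Firm name` != firm, ]
  if (nrow(peers) > 0) {
    peer_models[[paste(ind, firm, sep = "-")]] <- lm(Y ~ X + Z, data = peers)
  }
}

# Collect the coefficients into a matrix with one row per left-out firm.
coef_table <- t(sapply(peer_models, coef))
```

In this sample data Z is an exact linear function of X (Z = 6X + 17), so `lm` cannot identify a separate Z coefficient and reports it as NA; with real data that column would normally be filled in.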
