Linear regression with product of factor and independent variable

Question

I am trying to estimate a demand model:

d_t^k = a_t - b^k p_t^k + e_t^k

The index t is the week number and k is the product number. The demand for each product, d_t^k, depends on a general seasonality a_t that is shared by all products, and is an affine function of that product's price in that week, p_t^k, plus a normal random error e_t^k.

However, if I use the following lm function call, it gives me a single coefficient b for price, when what I want is one price coefficient b^k per product.

lm(demand ~ factor(week) + price, data = df)

What is the right way to express the model?

lm(demand ~ factor(week) + factor(product) * price, data = df)

I am guessing that the above would work, but I can't find any documentation that tells me what is going on there.

As a concrete example, I have the following code, which runs on a slightly different demand model: d_t^k = a_t + a^k - b^k p_t^k + e_t^k

# Generate fake prices and sales, and estimate the coefficients of
# the demand model.

number.of.items <- 20 # Must be a multiple of 4
number.of.weeks <- 5
coeff.item.min <- 300
coeff.item.max <- 500
coeff.price.min <- 1.4
coeff.price.max <- 2
normal.sd <- 40
set.seed(200)

# Generate random coefficients for the items
coeff.item <- runif(number.of.items, coeff.item.min, coeff.item.max)
coeff.price <- runif(number.of.items, coeff.price.min, coeff.price.max)
coeff.week <- 50 * 1:number.of.weeks

# Row is item, column is week
week.id.matrix <- outer(rep(1, number.of.items), 1:number.of.weeks)
item.id.matrix <- outer(1:number.of.items, rep(1, number.of.weeks))
price.matrix <- rbind(
  outer(rep(1, number.of.items / 4), c(100, 100, 90, 90, 80)),
  outer(rep(1, number.of.items / 4), c(100, 90, 90, 80, 60)),
  outer(rep(1, number.of.items / 4), c(100, 85, 85, 60, 60)),
  outer(rep(1, number.of.items / 4), c(100, 75, 60, 45, 45))
)
coeff.week.matrix <- outer(rep(1, number.of.items), coeff.week)
coeff.price.matrix <- outer(coeff.price, rep(1, number.of.weeks))
coeff.item.matrix <- outer(coeff.item, rep(1, number.of.weeks))
sales.matrix <- coeff.week.matrix +
  coeff.item.matrix -
  coeff.price.matrix * price.matrix +
  matrix(rnorm(number.of.weeks * number.of.items, 0, normal.sd),
         number.of.items, number.of.weeks)


df <- data.frame(item = factor(as.vector(item.id.matrix)),
                 week = factor(as.vector(week.id.matrix)),
                 price = as.vector(price.matrix),
                 sales = as.vector(sales.matrix))

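# Baseline: a single price coefficient shared by all items
model.shared <- lm(sales ~ week + item + price, data = df)
# One price coefficient per item; item * price expands to item + price + item:price
model <- lm(sales ~ week + item * price, data = df)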

print(summary(model))

Recommended answer

After some experimentation, it seems that

lm(demand ~ factor(week) + factor(product) * price, data = df)

does work.
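
As for what is going on in that call: in R's formula syntax, `a * b` is shorthand for `a + b + a:b`. So `factor(product) * price` contributes a per-product intercept, a baseline price slope, and per-product deviations from that slope. A minimal sketch that asks R for the expanded terms (only the formula is parsed, so no data frame is needed; the variable name `expanded` is just for illustration):

```r
# Expand the formula from the answer into its individual terms.
f <- demand ~ factor(week) + factor(product) * price
expanded <- attr(terms(f), "term.labels")
print(expanded)
```

Because the expansion includes a `factor(product)` main effect, this formula corresponds to the second model in the question (the one with a^k). For the first model, which has no per-product intercept, `factor(week) + factor(product):price` is likely the closer match.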

I don't know why I hadn't expected it to work earlier.
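
To check that an interaction term really yields one price coefficient per product, the following sketch fits the first model from the question, d_t^k = a_t - b^k p_t^k, on a tiny made-up data set. Everything here is an assumption for illustration: the data frame `toy`, the values in `a_week` and `b_product`, and the prices. The data is generated without noise so the recovered slopes can be compared directly with the ones used to build it.

```r
# Hypothetical toy data: 3 products observed over 4 weeks.
toy <- expand.grid(week = factor(1:4), product = factor(c("A", "B", "C")))

# Prices per product and week (week varies fastest, matching expand.grid).
# The price paths are deliberately not proportional to each other, otherwise
# the per-product slopes could not be separated from the week effects.
toy$price <- c(10, 9, 8, 7,    # product A
               10, 10, 8, 6,   # product B
               12, 9, 9, 5)    # product C

# Assumed "true" parameters: week seasonality a_t and price sensitivity b^k.
a_week <- c(100, 120, 140, 160)
b_product <- c(A = 1.5, B = 2.0, C = 2.5)

# Noise-free demand, d = a_t - b^k * p.
toy$demand <- a_week[as.integer(toy$week)] -
  unname(b_product[as.character(toy$product)]) * toy$price

# product:price without a price main effect gives one slope per product.
fit <- lm(demand ~ week + product:price, data = toy)

# Pick out the interaction coefficients; each one estimates -b^k.
coefs <- coef(fit)
slopes <- coefs[grep(":price$", names(coefs))]
print(-slopes)
```

With real, noisy data the estimates would only approximate the underlying b^k. For the second model in the question, adding `product` as a main effect (or writing `product * price`) brings in the per-product intercept a^k.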
