Polynomial feature expansion in R


Question

I'd like to do a polynomial feature expansion of a data frame. For example, a quadratic expansion of a df with columns (x1, x2, x3) should give a df with (x1, x2, x3, x1^2, x2^2, x3^2, x1x2, x1x3, x2x3). I'm currently using poly(df$x1, df$x2, df$x3, degree=2, raw=T), but this requires an unnecessary amount of typing when there are many columns. (And poly(df[,1:20], degree=2, raw=T) doesn't work.) What's the best way to do this?

I have too many columns for poly (it fails with a "vector is too large" error). I got it to work with a simple for loop:

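polyexp <- function(df) {
  # Append every pairwise product df[[i]] * df[[j]] with i <= j (squares included)
  p <- ncol(df)
  df.polyexp <- df
  new.names <- names(df)  # avoids shadowing the base function colnames()
  for (i in seq_len(p)) {
    for (j in i:p) {
      new.names <- c(new.names, paste0(names(df)[i], ".", names(df)[j]))
      df.polyexp <- cbind(df.polyexp, df[[i]] * df[[j]])
    }
  }
  names(df.polyexp) <- new.names
  df.polyexp
}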

Just add additional loops to compute higher-order terms.
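Instead of hand-writing one more nested loop per degree, the same idea can be sketched for an arbitrary degree. This is only an illustration, not the asker's code: the function name polyexp_deg and its degree argument are made up here. It enumerates index tuples with expand.grid, so it assumes a moderate number of columns (the tuple table has ncol(df)^degree rows before filtering).

```r
# Hypothetical generalisation of the loop above to any degree.
polyexp_deg <- function(df, degree = 2) {
  stopifnot(degree >= 1)
  p <- ncol(df)
  out <- df
  for (d in seq_len(degree)[-1]) {
    # All index tuples of length d, then keep only the non-decreasing ones
    # so that each product (e.g. x*y and y*x) appears exactly once.
    idx <- as.matrix(expand.grid(rep(list(seq_len(p)), d)))
    idx <- idx[apply(idx, 1, function(r) !is.unsorted(r)), , drop = FALSE]
    for (k in seq_len(nrow(idx))) {
      cols <- unname(idx[k, ])
      term <- paste(names(df)[cols], collapse = ".")
      # Multiply the selected columns together element-wise
      out[[term]] <- Reduce(`*`, lapply(cols, function(i) df[[i]]))
    }
  }
  out
}

dat <- data.frame(x = 1:5, y = 1:5, z = 2:6)
quad  <- polyexp_deg(dat, degree = 2)
cubic <- polyexp_deg(dat, degree = 3)
```

With three input columns, quad should carry the same nine terms as the quadratic expansion described in the question, and cubic adds the ten third-order products on top of those.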

Recommended answer

You could do this with do.call:

do.call(poly, c(lapply(1:20, function(i) dat[, i]), degree = 2, raw = TRUE))

Basically, do.call takes the function to be called (poly in your case) as its first argument and a list as its second. Each element of that list is then passed as an argument to the function. Here we build a list containing all of the columns you want to process (I've used lapply to get that list without too much typing), followed by the two additional arguments you want to pass.
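To make the mechanics concrete, here is a tiny sketch that uses paste instead of poly, chosen only because its result is easy to read. The variable names are illustrative. Unnamed list elements become positional arguments and named elements become named arguments:

```r
# Build the argument list: two positional arguments plus one named argument
args <- c(list("a", "b"), sep = "-")

# do.call(f, args) is equivalent to writing the call out by hand
via_do_call <- do.call(paste, args)
direct      <- paste("a", "b", sep = "-")

identical(via_do_call, direct)
```

The poly call above has the same shape: the unnamed list holds the columns, and degree and raw are appended as named elements.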

To see it working on a simple example:

dat <- data.frame(x=1:5, y=1:5, z=2:6)
do.call(poly, c(lapply(1:3, function(i) dat[, i]), degree = 2, raw = TRUE))
#      1.0.0 2.0.0 0.1.0 1.1.0 0.2.0 0.0.1 1.0.1 0.1.1 0.0.2
# [1,]     1     1     1     1     1     2     2     2     4
# [2,]     2     4     2     4     4     3     6     6     9
# [3,]     3     9     3     9     9     4    12    12    16
# [4,]     4    16     4    16    16     5    20    20    25
# [5,]     5    25     5    25    25     6    30    30    36
# attr(,"degree")
# [1] 1 2 1 2 2 1 2 2 2
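The column names in that output encode the exponent of each input in order, so "1.0.1" is x^1 * y^0 * z^1. If a data frame with readable names is wanted, as in the question, one possible sketch is below. The helper label_term and the "x:z" / "x^2" naming scheme are inventions for this example, and the sketch assumes the column names follow the dotted exponent format shown above.

```r
dat <- data.frame(x = 1:5, y = 1:5, z = 2:6)
m <- do.call(poly, c(lapply(seq_along(dat), function(i) dat[[i]]),
                     degree = 2, raw = TRUE))

# Hypothetical helper: turn an exponent code such as "1.0.1" into "x:z"
label_term <- function(code, vars) {
  e <- as.integer(strsplit(code, ".", fixed = TRUE)[[1]])
  keep <- e > 0
  paste0(vars[keep], ifelse(e[keep] > 1, paste0("^", e[keep]), ""),
         collapse = ":")
}

labels <- vapply(colnames(m), label_term, character(1),
                 vars = names(dat), USE.NAMES = FALSE)

# Drop the "poly" class and attributes, then attach the readable names
df2 <- as.data.frame(matrix(m, nrow = nrow(m)))
names(df2) <- labels
```

Names such as "x^2" are not syntactic R names, so the columns have to be accessed as df2[["x^2"]] or with backticks. A different separator may be more convenient if the result is going into a formula.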
