Apply a function to each column of a data.frame and organize the output

Published: 2021/4/9 18:56:40 · Tags: r, function, dataframe, apply, sapply


Problem description

I have this vector:

 x <- c(5,2,-4,-6,-2,1,4,2,-3,-6,-1,8,9,5,-6,-11)

I use this function:

myfunction <- function(x) {
  n <- length(x)

  # Running sum of x that is capped at 0 from above
  fx <- numeric(n)
  fx[1] <- min(x[1], 0)
  for (i in seq_len(n)[-1]) {
    fx[i] <- min(0, fx[i - 1] + x[i])
  }
  fx_min <- min(fx)

  # Same running sum, additionally floored at half of fx_min
  fx_05 <- numeric(n)
  fx_05[1] <- min(fx[1], 0)
  for (i in seq_len(n)[-1]) {
    s <- fx_05[i - 1] + x[i]
    if (s > 0) {
      fx_05[i] <- 0
    } else if (s < fx_min * 0.5) {
      fx_05[i] <- fx_min * 0.5
    } else {
      fx_05[i] <- s
    }
  }

  # Return the input next to the clamped running sum
  as.data.frame(matrix(c(x, fx_05), ncol = 2))
}
xx <- myfunction(x)

The dataframe xx is

    V1   V2
1    5  0.0
2    2  0.0
3   -4 -4.0
4   -6 -8.5
5   -2 -8.5
6    1 -7.5
7    4 -3.5
8    2 -1.5
9   -3 -4.5
10  -6 -8.5
11  -1 -8.5
12   8 -0.5
13   9  0.0
14   5  0.0
15  -6 -6.0
16 -11 -8.5

I would like to apply this function to a data.frame :

df <- data.frame(x = c(5,2,-4,-6,-2,1,4,2,-3,-6,-1,8,9,5,-6,-11),
                 y = c(5,2,-4,-6,-2,1,4,2,-3,-6,-1,8,9,5,-6,-11),
                 z = c(5,2,-4,-6,-2,1,4,2,-3,-6,-1,8,9,5,-6,-11))

Using:

output <- myfunction(df)

does not work, and using:

outputs <- data.frame(sapply(df, myfunction))

gives a data.frame in the wrong shape. The result should have 2 columns for each original column of the data.frame.

Solution

In this case you want lapply. A data.frame is really a list of equal-length vectors, so lapply processes each column in turn and returns a list with one two-column data.frame per original column.

x <- lapply(df, myfunction)
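To make the "list of columns" point concrete, here is a minimal sketch. The helper two_cols is a hypothetical, much shorter stand-in for myfunction (it only needs to return a two-column data.frame), and the small df is made-up data for illustration.

```r
# Hypothetical stand-in for myfunction: input next to a derived column
two_cols <- function(v) data.frame(V1 = v, V2 = pmin(cumsum(v), 0))

# Made-up example data, three columns of three values each
df <- data.frame(x = c(5, 2, -4), y = c(1, -1, 3), z = c(-2, 0, 2))

# A data.frame is a list of columns, so length() counts columns, not rows.
# This is why calling a vector function on the whole data.frame goes wrong.
length(df)

# lapply hands each column to the function and collects the results in a list
res <- lapply(df, two_cols)
names(res)   # one element per original column
str(res$x)   # each element is a data.frame with columns V1 and V2
```

Each list element can then be inspected or post-processed on its own before the pieces are combined again.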

sapply works as well. The only difference is that it tries to simplify the result, so the returned object is structured differently at first. Compare print(x) for each approach to see the difference.

x <- sapply(df, myfunction)
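The structural difference can be checked directly. The sketch below again uses the hypothetical helper two_cols and made-up data in place of myfunction and the original df. It relies only on the documented simplification behaviour of sapply.

```r
# Hypothetical stand-in for myfunction: input next to a derived column
two_cols <- function(v) data.frame(V1 = v, V2 = pmin(cumsum(v), 0))
df <- data.frame(x = c(5, 2, -4), y = c(1, -1, 3), z = c(-2, 0, 2))

res_l <- lapply(df, two_cols)   # list of three data.frames
res_s <- sapply(df, two_cols)   # simplified: a matrix whose cells are vectors

is.matrix(res_s)   # TRUE
typeof(res_s)      # "list"
dim(res_s)         # 2 3  (V1/V2 in rows, x/y/z in columns)

# simplify = FALSE switches the simplification off and gives the lapply result
res_k <- sapply(df, two_cols, simplify = FALSE)
identical(res_k, res_l)
```

If the per-column data.frames are needed later, the lapply form (or simplify = FALSE) is usually the easier one to work with.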

Afterwards you probably want to combine the list elements into a single data.frame again. You can do this with do.call:

df2 <- do.call(cbind, x)

This will mess up the column names. You can change them with names:

names(df2) <- NULL
df2
# 1    5  0.0   5  0.0   5  0.0
# 2    2  0.0   2  0.0   2  0.0
# 3   -4 -4.0  -4 -4.0  -4 -4.0
# 4   -6 -8.5  -6 -8.5  -6 -8.5
# ....
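Instead of dropping the names, you could also assign descriptive ones. The following is a sketch under assumptions: two_cols is a hypothetical stand-in for myfunction, the data is made up, and the suffixes "in" and "out" are arbitrary labels chosen for illustration.

```r
# Hypothetical stand-in for myfunction: input next to a derived column
two_cols <- function(v) data.frame(V1 = v, V2 = pmin(cumsum(v), 0))
df <- data.frame(x = c(5, 2, -4), y = c(1, -1, 3), z = c(-2, 0, 2))

res <- lapply(df, two_cols)
df2 <- do.call(cbind, res)   # one data.frame, two columns per original column

# Build one "<column>_<role>" name per output column
names(df2) <- paste(rep(names(res), each = 2), c("in", "out"), sep = "_")
df2
```

Descriptive names keep it clear which pair of columns belongs to which original column.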

Side Note:

If your input is a matrix rather than a data.frame, another option is apply with MARGIN = 2.

x <- apply(df, MARGIN = 2, myfunction)

Although it also works in this example, you will run into trouble when your columns have differing data types, because apply converts the data.frame to a matrix before applying the function. It is therefore not recommended for data.frames. More info on that can be found in this detailed and easy-to-understand post!
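The coercion is easy to see with a small mixed-type data.frame. The data below is made up for illustration.

```r
# Made-up data.frame with one numeric and one character column
mixed <- data.frame(num = c(1.5, 2.5), chr = c("a", "b"),
                    stringsAsFactors = FALSE)

# apply first converts to a matrix, and a matrix holds a single type,
# so the numeric column reaches the function as character
by_apply <- apply(mixed, 2, class)

# sapply (like lapply) works on the columns as they are
by_sapply <- sapply(mixed, class)

by_apply
by_sapply
```

With purely numeric columns, as in the question, the conversion is harmless, which is why apply happens to work there.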

Further reading on this:
Hadley Wickham's Advanced R. Also check out the section on data types on this site.
Peter Werner's blog post

I greatly appreciate the input of @Gregor on this post.
