Change multiple dataframes in a loop


Question

I have, for example, these three datasets (in my case there are many more, each with a lot of variables):

data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))

To each data frame I want to add a variable that results from transforming an existing variable of that data frame. I would like to do this with a loop. For example:

datasets <- c("data_frame1","data_frame2","data_frame3")
vars <- c("a","b","c")
for (i in datasets){
    for (j in vars){
        # here I need a code that create a new variable with transformed values
        # I thought this would work, but it didn't...
        get(i)$new_var <- log(get(i)[,j])
    }
}

Do you have any suggestions?

It would also be great if the new column names (in this case new_var) could be assigned from a character string, so that I could create the new variables with another for loop nested inside the other two.

I hope my explanation was not too tangled.

Thanks in advance.

Solution

You can put your data frames in a list and use lapply to process them one by one, so there is no need for an explicit loop in this case.

For example, you can do this:

data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))

ll <- list(data_frame1,data_frame2,data_frame3)
lapply(ll,function(df){
  df$log_a <- log(df$a)              ## new column: log of column a
  df$tans_col <- df$a + df$b + df$c  ## new column: sum of columns (or any other transformation)
  ## ... further transformations ...
  df                                 ## return the modified data frame
})

The first element of the returned list (the transformed copy of data_frame1) is:

[[1]]
  a b c     log_a tans_col
1 1 3 4 0.0000000        8
2 5 6 4 1.6094379       15
3 3 1 1 1.0986123        5
4 3 5 9 1.0986123       17
5 2 5 2 0.6931472        9
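
For the second part of the question (building the new column names from a character string), here is a minimal sketch rather than a definitive solution. It assumes the data frames live in the current environment under the names from the question; the `log_` prefix for the new columns is only an illustrative naming scheme. The original attempt fails because `get(i)` returns a value, not a variable that can be assigned into, so the sketch collects the data frames in a named list with `mget()`, modifies them inside `lapply()`, and optionally writes them back with `list2env()`.

```r
# Example data, copied from the question
data_frame1 <- data.frame(a=c(1,5,3,3,2), b=c(3,6,1,5,5), c=c(4,4,1,9,2))
data_frame2 <- data.frame(a=c(6,0,9,1,2), b=c(2,7,2,2,1), c=c(8,4,1,9,2))
data_frame3 <- data.frame(a=c(0,0,1,5,1), b=c(4,1,9,2,3), c=c(2,9,7,1,1))

datasets <- c("data_frame1", "data_frame2", "data_frame3")
vars <- c("a", "b", "c")

# mget() looks the objects up by name and returns a named list
ll <- mget(datasets)

ll <- lapply(ll, function(df) {
  for (v in vars) {
    new_name <- paste0("log_", v)   # hypothetical naming scheme for the new column
    df[[new_name]] <- log(df[[v]])  # [[ ]] accepts a character string as column name
  }
  df                                # return the modified copy
})

# Optional: overwrite the original variables with the modified copies
invisible(list2env(ll, envir = environment()))

names(data_frame1)
```

Keeping the data frames in the list `ll` is usually enough; the `list2env()` step is only needed if the individual variables must be updated too. Note that `log(0)` yields `-Inf`, which affects the zero entries in data_frame2 and data_frame3.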
