Add a column in a list of data frames


Problem description

I want to add a column to each of the data frames in my list table after running this code:

#list of my dataframes
df <- list(df1,df2,df3,df4)

#compute stats
stats <- function(d) {
  # one summary row per level of the second column
  do.call(rbind, lapply(split(d, d[, 2]), function(x) {
    data.frame(Nb = length(x$Year), Mean = mean(x$A), SD = sd(x$A))
  }))
}

#Apply to list of dataframes
table <- lapply(df, stats)

This column, which I call Source for example, should hold the name of the data frame each row came from, alongside the Nb, Mean and SD variables. So Source should contain df1, df1, df1, ... for table[[1]], and so on.

Is there any way I can add it to my code above?

Solution

Here's a different way of doing things:

First, let's start with some reproducible data:

set.seed(1)
n = 10
dat <- list(data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)),
            data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)),
            data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)),
            data.frame(a=rnorm(n), b=sample(1:3,n,TRUE)))

Then, you want a function that adds columns to a data.frame. The obvious candidate is within. The values you want to calculate are constant for every observation within a given category, so use ave for each column you want to add: it computes a per-group summary and returns it at the same length as the input. Here's your new function:

stat <- function(d){
    within(d, {
        Nb = ave(a, b, FUN=length)
        Mean = ave(a, b, FUN=mean)
        SD = ave(a, b, FUN=sd)
    })        
}
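To make the role of ave a little more concrete, here is a minimal sketch on toy vectors. The values of a and b below are made up for illustration and are not part of the data above:

```r
# Toy values and a grouping vector (illustrative only)
a <- c(1, 2, 3, 4, 5)
b <- c(1, 1, 2, 2, 2)

# ave() applies FUN within each group of b and returns a vector
# as long as a, so the result can be stored directly as a new column
grp_mean <- ave(a, b, FUN = mean)    # 1.5 1.5 4.0 4.0 4.0
grp_n    <- ave(a, b, FUN = length)  # 2 2 3 3 3
```

Because the result lines up row by row with the input, no merge or rbind step is needed afterwards.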

Then just lapply it to your list of data.frames:

lapply(dat, stat)

As you can see, columns are added as appropriate (SD is NA where a category contains a single observation):

> str(lapply(dat, stat))
List of 4
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] -0.626 0.184 -0.836 1.595 0.33 ...
  ..$ b   : int [1:10] 3 1 2 1 1 2 1 2 3 2
  ..$ SD  : num [1:10] 0.85 0.643 0.738 0.643 0.643 ...
  ..$ Mean: num [1:10] -0.0253 0.649 -0.3058 0.649 0.649 ...
  ..$ Nb  : num [1:10] 2 4 4 4 4 4 4 4 2 4
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] -0.0449 -0.0162 0.9438 0.8212 0.5939 ...
  ..$ b   : int [1:10] 2 3 2 1 1 1 1 2 2 2
  ..$ SD  : num [1:10] 1.141 NA 1.141 0.136 0.136 ...
  ..$ Mean: num [1:10] -0.0792 -0.0162 -0.0792 0.7791 0.7791 ...
  ..$ Nb  : num [1:10] 5 1 5 4 4 4 4 5 5 5
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] 1.3587 -0.1028 0.3877 -0.0538 -1.3771 ...
  ..$ b   : int [1:10] 2 3 2 1 3 1 3 1 1 1
  ..$ SD  : num [1:10] 0.687 0.668 0.687 0.635 0.668 ...
  ..$ Mean: num [1:10] 0.873 -0.625 0.873 0.267 -0.625 ...
  ..$ Nb  : num [1:10] 2 3 2 5 3 5 3 5 5 5
 $ :'data.frame':       10 obs. of  5 variables:
  ..$ a   : num [1:10] -0.707 0.365 0.769 -0.112 0.881 ...
  ..$ b   : int [1:10] 3 3 2 2 1 1 3 1 2 2
  ..$ SD  : num [1:10] 0.593 0.593 1.111 1.111 0.297 ...
  ..$ Mean: num [1:10] -0.318 -0.318 0.24 0.24 0.54 ...
  ..$ Nb  : num [1:10] 3 3 4 4 3 3 3 3 4 4
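The question also asked for a Source column holding each data frame's name, which the code above does not add. One possible sketch, under some assumptions: list(df1, df2) does not keep the object names, so the list has to be named explicitly, and Map can then walk the data frames and their names in parallel. The make_df helper and the column names Year, Group and A are hypothetical stand-ins for the asker's real data, chosen only so that the example runs on its own.

```r
# Hypothetical stand-ins for the asker's df1, df2, ...
# (columns Year, Group, A are assumed; the second column is the grouping variable)
set.seed(1)
make_df <- function() {
  data.frame(Year = 2001:2010, Group = sample(1:3, 10, TRUE), A = rnorm(10))
}

# Name the list elements explicitly; list(df1, df2) would leave names(df) NULL
df <- list(df1 = make_df(), df2 = make_df())

# The summary function from the question
stats <- function(d) {
  do.call(rbind, lapply(split(d, d[, 2]), function(x) {
    data.frame(Nb = length(x$Year), Mean = mean(x$A), SD = sd(x$A))
  }))
}

# Map() passes each data frame together with its name;
# the single name is recycled over every row of the summary
tables <- Map(function(d, nm) data.frame(Source = nm, stats(d)),
              df, names(df))
```

The same Map pattern should work with the stat function from this answer in place of stats, as long as the list is named.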
