R - Apply function on multiple data frames
Question
I loaded several CSV files into R as data frames with:
temp = list.files(pattern="*.csv")
for (i in 1:length(temp)) assign(temp[i], read.csv(temp[i]))
Now I would like to apply a function to all of the data frames. I thought about something like:
kappa1_mean_h_stem <- lapply(df.list, mean_h_stem)
where df.list is a list containing all of the data frames, and the function is:
mean_h_stem <- function(x) {
mean(x[1,3])
}
I want the function to return the mean of a specific column, but R tells me that I have the wrong number of dimensions.
Recommended answer
I think the reason for your error is that you passed x[1,3], which selects only the value in the first row of the third column. (An "incorrect number of dimensions" error also suggests that an element of df.list is not a data frame, so it is worth checking that df.list holds the data frames themselves rather than, for example, their names.) I assume you want to calculate the mean of the same column across all the data.frames, so I made a slight modification to your function so that you can pass the data and the name or position of the column:
mean_h_stem <- function(dat, col) mean(dat[, col], na.rm = TRUE)
The column can be selected with an integer position:
lapply(df.list, mean_h_stem, 2)
Or with a column name, given as a string:
lapply(df.list, mean_h_stem, 'col_name')
Passing the second argument like this can feel a little unintuitive, so you can write the call more explicitly:
lapply(df.list, function(x) mean_h_stem(dat = x, col = 'col_name'))
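To see these calls end to end, here is a minimal sketch on two made-up data frames. The contents of df.list, the column name height, and the values are illustrative assumptions, not the asker's data. It also shows vapply, which is one way to get a plain numeric vector back instead of a list.

```r
# Hypothetical toy data standing in for the CSV files (not the asker's data).
df.list <- list(
  a = data.frame(id = 1:3, height = c(1.0, 2.0, 3.0)),
  b = data.frame(id = 1:3, height = c(4.0, NA, 6.0))
)

# Same helper as in the answer: mean of one column, ignoring NA values.
mean_h_stem <- function(dat, col) mean(dat[, col], na.rm = TRUE)

# Arguments given after the function are passed on to mean_h_stem.
by_position <- lapply(df.list, mean_h_stem, 2)
by_name     <- lapply(df.list, mean_h_stem, "height")

# vapply returns a named numeric vector and checks that each result
# is a single number.
means <- vapply(df.list, mean_h_stem, numeric(1), col = "height")
print(means)
```

Selecting by position and by name gives the same result here because height is the second column in both data frames; selecting by name is usually the safer choice when the files might not share the same column order.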
As in your question, this handles only a single column at a time, but you could easily modify it to handle multiple columns.
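One possible way to handle several columns is sketched below. The helper name mean_cols, the column names h and w, and the data are all hypothetical; the sketch assumes the selected columns are numeric and relies on colMeans to compute one mean per column.

```r
# Hypothetical helper: means of several columns of one data frame.
# drop = FALSE keeps a data frame even when only one column is selected.
mean_cols <- function(dat, cols) {
  colMeans(dat[, cols, drop = FALSE], na.rm = TRUE)
}

# Illustrative data only.
df.list <- list(
  a = data.frame(h = c(1, 2, 3), w = c(10, 20, 30)),
  b = data.frame(h = c(4, NA, 6), w = c(40, 50, 60))
)

# One named numeric vector per data frame.
res <- lapply(df.list, mean_cols, cols = c("h", "w"))

# Optionally bind the results into a matrix with one row per data frame.
res_mat <- do.call(rbind, res)
print(res_mat)
```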
As an aside, to read in the csv files you could also use lapply with read.csv, which gives you df.list directly:
temp <- list.files(pattern = '\\.csv$')  # pattern is a regular expression, not a glob
df.list <- lapply(temp, read.csv)
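A list built this way has no element names, so it can be hard to tell which result came from which file. The sketch below is one way to keep track, assuming that is something you want: it uses setNames to name each data frame after its file. The temporary directory, the file names, and the column y are made up so that the example is self-contained.

```r
# Create two small CSV files in a temporary directory (illustrative data only).
demo_dir <- tempfile("csv_demo")
dir.create(demo_dir)
write.csv(data.frame(x = 1:2, y = c(3, 5)),
          file.path(demo_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4, y = c(7, 9)),
          file.path(demo_dir, "second.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv can open regardless of
# the current working directory.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each list element after its file so results can be traced back.
df.list <- setNames(lapply(files, read.csv), basename(files))

# The names carry through to the result of sapply.
y_means <- sapply(df.list, function(d) mean(d[, "y"], na.rm = TRUE))
print(y_means)
```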