Compute the mean of two columns in a dataframe
Problem description
I have a dataframe storing different values. Sample:
a$open   a$high   a$low    a$close
1.08648  1.08707  1.08476  1.08551
1.08552  1.08623  1.08426  1.08542
1.08542  1.08572  1.08453  1.08465
1.08468  1.08566  1.08402  1.08554
1.08552  1.08565  1.08436  1.08464
1.08463  1.08543  1.08452  1.08475
1.08475  1.08504  1.08427  1.08436
1.08433  1.08438  1.08275  1.08285
1.08275  1.08353  1.08275  1.08325
1.08325  1.08431  1.08315  1.08378
1.08379  1.08383  1.08275  1.08294
1.08292  1.08338  1.08271  1.08325
What I want to do is create a new column a$mean storing the mean of a$high and a$low for each row.
Here is how I achieved that:
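highlowmean <- function(highs, lows) {
  m <- vector(mode = "numeric", length = 0)
  for (i in seq_along(highs)) {
    # mean() averages only its first argument, so combine the pair with c()
    m[i] <- mean(c(highs[i], lows[i]))
  }
  return(m)
}
a$mean <- highlowmean(a$high, a$low)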
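However, I'm fairly new to R and to functional languages in general, so I'm pretty sure there is a more efficient and simpler way to achieve this.
What is the smartest way to do it?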
Solution
We can use rowMeans:
a$mean <- rowMeans(a[, c('high', 'low')], na.rm = TRUE)
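As a minimal sketch of the rowMeans call above, assuming a data frame built from only the first three sample rows (high and low columns), the result can be checked against plain vectorized arithmetic:

```r
# Illustrative data: the first three rows of the sample, high and low only
a <- data.frame(
  high = c(1.08707, 1.08623, 1.08572),
  low  = c(1.08476, 1.08426, 1.08453)
)

# Row-wise mean of the two selected columns
a$mean <- rowMeans(a[, c("high", "low")], na.rm = TRUE)

# Equivalent when no values are missing: arithmetic in R is vectorized
mean_by_hand <- (a$high + a$low) / 2

print(isTRUE(all.equal(a$mean, mean_by_hand)))
```

Neither form needs an explicit loop, since both operate on whole columns at once.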
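NOTE: If there are NA values, it is better to use rowMeans. For example:
a <- data.frame(High = c(NA, 3, 2), low = c(3, NA, 0))
rowMeans(a, na.rm = TRUE)
#[1] 3 3 1
whereas replacing the NA values with 0 and using + gives a different result, because each replaced value is counted as a zero:
a1 <- replace(a, is.na(a), 0)
(a1[1] + a1[2]) / 2
#  High
#1  1.5
#2  1.5
#3  1.0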
NOTE: This is in no way an attempt to tarnish the other answer. It works in most cases and is fast as well.
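To make the difference in NA handling concrete, the sketch below rebuilds the small example data frame (with lowercase column names, an assumption made for brevity) and compares three approaches. The variable names with_rowmeans, with_plus and with_zero are illustrative only.

```r
a <- data.frame(high = c(NA, 3, 2), low = c(3, NA, 0))

# rowMeans with na.rm = TRUE averages only the non-missing values in each row
with_rowmeans <- rowMeans(a, na.rm = TRUE)

# Plain arithmetic propagates NA: any row with a missing value yields NA
with_plus <- (a$high + a$low) / 2

# Replacing NA with 0 avoids NA but counts each missing value as a zero
a0 <- replace(a, is.na(a), 0)
with_zero <- (a0$high + a0$low) / 2

print(with_rowmeans)
print(with_plus)
print(with_zero)
```

Which behavior is appropriate depends on whether a missing value should be ignored, should invalidate the row, or should be treated as zero.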