Compute the mean of two columns in a dataframe


Problem description

I have a dataframe storing different values. Sample:

a$open  a$high  a$low   a$close

1.08648 1.08707 1.08476 1.08551
1.08552 1.08623 1.08426 1.08542
1.08542 1.08572 1.08453 1.08465
1.08468 1.08566 1.08402 1.08554
1.08552 1.08565 1.08436 1.08464
1.08463 1.08543 1.08452 1.08475
1.08475 1.08504 1.08427 1.08436
1.08433 1.08438 1.08275 1.08285
1.08275 1.08353 1.08275 1.08325
1.08325 1.08431 1.08315 1.08378
1.08379 1.08383 1.08275 1.08294
1.08292 1.08338 1.08271 1.08325

What I want to do is create a new column, a$mean, storing the mean of a$high and a$low for each row.

Here is how I achieved that:

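highlowmean <- function(highs, lows){
  # preallocate the result instead of growing it inside the loop
  m <- numeric(length(highs))
  for (i in seq_along(highs)){
    # mean() averages a single vector; a second positional argument is read as `trim`
    m[i] <- mean(c(highs[i], lows[i]))
  }
  return(m)
}

a$mean <- highlowmean(a$high, a$low)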

However, I'm fairly new to R and to functional languages in general, so I'm pretty sure there is a more efficient or simpler way to achieve that.

What is the smartest way to achieve that?

Solution

We can use rowMeans

 a$mean <- rowMeans(a[,c('high', 'low')], na.rm=TRUE)
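
To try this end to end, here is a minimal sketch that rebuilds the first three rows of the sample data from the question and applies the same rowMeans call. The plain column names (open, high, low, close) are an assumption based on the a$open ... a$close header shown in the question.

```r
# First three rows of the sample data from the question
# (column names assumed from the a$open ... a$close header)
a <- data.frame(
  open  = c(1.08648, 1.08552, 1.08542),
  high  = c(1.08707, 1.08623, 1.08572),
  low   = c(1.08476, 1.08426, 1.08453),
  close = c(1.08551, 1.08542, 1.08465)
)

# Row-wise mean of the two selected columns
a$mean <- rowMeans(a[, c("high", "low")], na.rm = TRUE)

# With no NA values present, this matches plain vectorized arithmetic
same_as_arithmetic <- isTRUE(all.equal(unname(a$mean), (a$high + a$low) / 2))

print(a)
print(same_as_arithmetic)
```

Selecting the columns by name keeps the call independent of column order, and no explicit loop is needed because rowMeans works on all rows at once.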

NOTE: If there are NA values, it is better to use rowMeans with na.rm=TRUE, which averages only the non-missing values in each row.

For example

 a <- data.frame(High= c(NA, 3, 2), low= c(3, NA, 0))
 rowMeans(a, na.rm=TRUE)    
 #[1] 3 3 1

and using + after replacing the NA values with 0

 a1 <- replace(a, is.na(a), 0)
 (a1[1] + a1[2])/2
#  High
#1  1.5
#2  1.5
#3  1.0
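
The difference between the approaches may be easier to see side by side. The sketch below reuses the same small data frame and compares three variants: plain + without any NA handling, rowMeans with na.rm=TRUE, and + after replacing NA with 0. The variable names plain, skip_na and zero_filled are made up for illustration.

```r
a <- data.frame(High = c(NA, 3, 2), low = c(3, NA, 0))

# Plain arithmetic propagates NA: a row with a missing value yields NA
plain <- (a$High + a$low) / 2

# rowMeans with na.rm = TRUE averages only the values that are present
skip_na <- rowMeans(a, na.rm = TRUE)

# Replacing NA with 0 first treats a missing value as a real zero,
# which halves the remaining value instead of returning it unchanged
a1 <- replace(a, is.na(a), 0)
zero_filled <- (a1$High + a1$low) / 2

print(data.frame(plain, skip_na, zero_filled))
```

Which behaviour is appropriate depends on what a missing value means in the data; for prices such as high and low, treating NA as 0 would usually distort the mean.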

NOTE: This is in no way trying to tarnish the other answer. It works in most cases and is fast as well.
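
A side note on the loop-based version in the question: mean() averages only its first argument, and a second positional argument is matched to trim rather than included in the average, so two scalars need to be combined with c() first. A minimal sketch with made-up values:

```r
# A second positional argument to mean() is matched to `trim`,
# so only the first value is used here
first_only <- mean(1, 3)

# Combining the values into one vector gives the intended average
both <- mean(c(1, 3))

print(first_only)  # 1
print(both)        # 2
```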
