Add column with row-wise mean over selected columns using dplyr


Question


I have a data frame that contains several variables measured at different time points (e.g., test1_tp1, test1_tp2, test1_tp3, test2_tp1, test2_tp2, ...).

I am trying to use dplyr to add a new column to the data frame that holds the row-wise mean over a selection of these columns (e.g., the mean over all time points for test1).

  1. I struggle even with the syntax for calculating the mean over explicitly named columns. What I tried without success was:

data %>% ... %>% mutate(test1_mean = mean(test1_tp1, test1_tp2, test1_tp3, na.rm = TRUE))

  2. I would further like to use regex/wildcards to select the column names, so something like

data %>% ... %>% mutate(test1_mean = mean(matches("test1_.*"), na.rm = TRUE))

Solution

You can use starts_with inside select to find all columns starting with a certain string.

data %>%
  mutate(test1 = select(., starts_with("test1_")) %>%
           rowMeans(na.rm = TRUE))
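
For context, mean() averages only its first argument, so passing several columns to it does not give a per-row result, whereas rowMeans() works across the columns of a data frame row by row. Below is a minimal, self-contained sketch of the approach on toy data. The column names follow the question, but the values and the result names are made up for illustration, and the matches() variant is just one possible way to cover the regex part of the question.

```r
library(dplyr)

# Toy data: column names follow the question, values are invented.
data <- tibble(
  test1_tp1 = c(1, 2, NA),
  test1_tp2 = c(3, 4, 6),
  test1_tp3 = c(5, NA, 9),
  test2_tp1 = c(10, 20, 30)
)

# Approach from the answer: pick the test1_ columns, then take row means.
# The dot refers to the data frame piped into mutate() by %>%.
result <- data %>%
  mutate(test1_mean = select(., starts_with("test1_")) %>%
           rowMeans(na.rm = TRUE))

# Regex variant: matches() selects columns by regular expression.
result_regex <- data %>%
  mutate(test1_mean = rowMeans(select(., matches("^test1_")), na.rm = TRUE))

print(result$test1_mean)
# test1_mean is 3, 3, 7.5 (NA values are dropped within each row)
```

Both variants rely on the dot placeholder of the magrittr pipe %>%, so they would need adjusting if the native |> pipe were used instead.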
