How to combine multiple data frame columns in R

Question

I have a .csv file with demographic data for my participants. The data are coded and downloaded from my study database (REDCap) in such a way that each race has its own separate column. That is, each participant has a value in each of these columns (1 if endorsed, 0 if not endorsed).

It looks like this:

SubjID  Sex  Age  White  AA  Asian  Other
001     F    62   0      1   0      0
002     M    66   1      0   0      0

I have to use a roundabout way to get my demographic summary stats. There's gotta be a simpler way to do this. My question is: how can I combine these columns into one column so that each participant has only one value for race? (I.e., recode so that 1 = White, 2 = AA, etc., with only the endorsed category pulled for each participant and added to this column.)

This is how I would like it to look:

SubjID  Sex  Age  Race
001     F    62   2
002     M    66   1

Recommended answer

This is more or less how we handle similar data from REDCap: we use pivot_longer on the dummy variables. The final Race variable can also be made a factor. Please let me know if this is what you had in mind.

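Edit: added names_ptypes to pivot_longer so that Race is created as a factor (instead of converting it afterwards with mutate). Note that in tidyr 1.1.0 and later, names_ptypes only checks the column type; the conversion itself is done with names_transform.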

library(tidyverse)

df <- data.frame(
  SubjID = c("001", "002"),
  Sex = c("F", "M"),
  Age = c(62, 66),
  White = c(0, 1),
  AA = c(1, 0),
  Asian = c(0, 0),
  Other = c(0, 0)
)

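# Reshape the dummy columns to long form, then keep only the endorsed race.
df %>%
  pivot_longer(
    cols = c("White", "AA", "Asian", "Other"),
    names_to = "Race",
    # tidyr >= 1.1.0: names_ptypes only checks types, so convert with names_transform
    names_transform = list(Race = as.factor),
    values_to = "Value"
  ) %>%
  filter(Value == 1) %>%
  select(-Value)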

Result:

# A tibble: 2 x 4
  SubjID Sex     Age Race 
  <fct>  <fct> <dbl> <fct>
1 001    F        62 AA   
2 002    M        66 White
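
The result above holds race labels, whereas the question asked for numeric codes (1 = White, 2 = AA, and so on). As a rough sketch only, and not part of the original answer, the same recoding can be done in base R with max.col(). It assumes that every participant endorses exactly one category and that the column order defines the codes. The names race_cols, race_code, result and RaceLabel are made up for illustration.

```r
# Same toy data as in the answer above.
df <- data.frame(
  SubjID = c("001", "002"),
  Sex    = c("F", "M"),
  Age    = c(62, 66),
  White  = c(0, 1),
  AA     = c(1, 0),
  Asian  = c(0, 0),
  Other  = c(0, 0)
)

# Assumed coding, taken from the column order: 1 = White, 2 = AA, 3 = Asian, 4 = Other.
race_cols <- c("White", "AA", "Asian", "Other")

# Assumption: exactly one category is endorsed per participant; stop otherwise.
stopifnot(rowSums(df[race_cols]) == 1)

# max.col() gives, for each row, the index of the column with the largest value,
# which here is the position of the single 1 among the dummy columns.
race_code <- max.col(as.matrix(df[race_cols]), ties.method = "first")

result <- df[c("SubjID", "Sex", "Age")]
result$Race <- race_code
# Optional labelled version of the same variable, with levels in coding order.
result$RaceLabel <- factor(race_cols[race_code], levels = race_cols)

print(result)
```

If participants can endorse more than one race, the stopifnot() check fails rather than silently picking one category. In that case a separate "multiracial" code or the long format from the answer is probably the better fit.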
