Reshaping longitudinal values to wide, for data with multiple rows per subject


Problem description

R: Reshape data from long to wide, including a date

(I tried to include data, but the formatting was so horrible that I gave up.)

I have a question on reshaping longitudinal data. The trick is that I have multiple rows per subject ('row_num' ranges from 1 to 8).

I've looked at over 20 Stack Overflow pages, some other pages, and Hadley's posts (and articles) on reshaping, and there seems to be a gap I can't fill:

I have multiple rows per subject and can't find out how to turn them into one row. The trick is that I need to take (say) Q1, ..., Q5 and turn them into two or more blocks of questions (Q1.1, ..., Q5.1, Q1.2, ..., Q5.2, ...).

Using one terminology, 'row_num' is a within-subject variable, and everything I have found seems to be based on between-subject variables.

It puzzles and frustrates me, since IMHO the classic longitudinal-to-wide situation is: multiple rows per subject (with one block of variables) to one row per subject (with multiple blocks of variables).

Could somebody direct me to some answers or tutorials that cover this?

Many thanks!

Edit: here is what the original data frame looks like when printed:

# A tibble: 6 x 9
  Subject_ID row_num Date          Q1 Q2_text       Q3    Q4    Q5 Q6_text      
       <dbl>   <dbl> <date>     <dbl> <chr>      <dbl> <dbl> <dbl> <chr>        
1          1       1 2019-01-01     4 Because        5     5     1 and so on    
2          1       2 2019-01-02     1 O, bother      5     4     1 NA           
3          1       3 2019-01-03     2 NA             3     4    NA NA           
4          2       1 2018-12-04    NA NA             1     4     1 NA           
5          3       1 2018-12-15     3 In addtion     5    NA     3 NA           
6          3       2 2018-12-26     1 NA             4     3     2 in conclusion

Recommended answer

Here is an approach using tidyr.

library(dplyr)  # provides mutate_at() and vars()
library(tidyr)  # provides gather(), unite() and spread()
df2 <- df %>%
  # (optional) First convert all the data columns to text so 
  #   they're readable throughout the process.
  mutate_at(vars(Date:Q6_text), as.character) %>%

  # Gather into long format, where we record the column it came from 
  #   as "question" and the value it held as "value"
  gather(question, value, -Subject_ID, -row_num) %>%

  # Combine the row_num and question into a new column
  unite("question2", c("row_num", "question")) %>%

  # Use that new column to spread everything out
  spread(question2, value)


> df2
  Subject_ID     1_Date 1_Q1  1_Q2_text 1_Q3 1_Q4 1_Q5 1_Q6_text     2_Date 2_Q1 2_Q2_text 2_Q3 2_Q4 2_Q5     2_Q6_text     3_Date 3_Q1 3_Q2_text 3_Q3 3_Q4 3_Q5 3_Q6_text
1          1 2019-01-01    4    Because    5    5    1 and so on 2019-01-02    1 O, bother    5    4    1          <NA> 2019-01-03    2      <NA>    3    4 <NA>      <NA>
2          2 2018-12-04 <NA>       <NA>    1    4    1      <NA>       <NA> <NA>      <NA> <NA> <NA> <NA>          <NA>       <NA> <NA>      <NA> <NA> <NA> <NA>      <NA>
3          3 2018-12-15    3 In addtion    5 <NA>    3      <NA> 2018-12-26    1      <NA>    4    3    2 in conclusion       <NA> <NA>      <NA> <NA> <NA> <NA>      <NA>
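
As a possible alternative, newer versions of tidyr (1.0.0 and later) provide pivot_wider(), which can do the same reshaping in one step and keeps the original column types, so the conversion to character is not needed. The following is a minimal sketch, not part of the original answer. It assumes the data frame is called df and re-types the six rows from the printout in the question; the name df_wide is only illustrative.

```r
library(tidyr)  # pivot_wider() requires tidyr >= 1.0.0

# Data re-typed from the printout in the question (assumed to match the real df).
df <- data.frame(
  Subject_ID = c(1, 1, 1, 2, 3, 3),
  row_num    = c(1, 2, 3, 1, 1, 2),
  Date       = as.Date(c("2019-01-01", "2019-01-02", "2019-01-03",
                         "2018-12-04", "2018-12-15", "2018-12-26")),
  Q1         = c(4, 1, 2, NA, 3, 1),
  Q2_text    = c("Because", "O, bother", NA, NA, "In addtion", NA),
  Q3         = c(5, 5, 3, 1, 5, 4),
  Q4         = c(5, 4, 4, 4, NA, 3),
  Q5         = c(1, 1, NA, 1, 3, 2),
  Q6_text    = c("and so on", NA, NA, NA, NA, "in conclusion"),
  stringsAsFactors = FALSE
)

# One row per subject; each value column is repeated once per row_num.
# names_sep = "." produces names such as Q1.1, Q1.2, ... as asked for in the question.
df_wide <- pivot_wider(
  df,
  id_cols     = Subject_ID,
  names_from  = row_num,
  values_from = Date:Q6_text,
  names_sep   = "."
)

names(df_wide)
```

With this call the columns come out grouped by variable (Date.1, Date.2, Date.3, Q1.1, ...) rather than by occasion. If I remember correctly, tidyr 1.2.0 and later accept names_vary = "slowest" to group them by row_num instead, but check the documentation of your installed version before relying on that.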
