Converting long format to wide format

Question

I have copy-number data in long format, as shown below, where each sample has its own copy-number value (segVal) over its own genomic ranges:

> head(long)
   chromosome     start       end segVal                       sample
1:       chr1   3218923 116319008      2 TCGA-05-4417-01A-22D-1854-01
2:       chr1 116324707 120523902      1 TCGA-05-4417-01A-22D-1854-01
3:       chr1 149879545 247812431      4 TCGA-05-4417-01A-22D-1854-01
4:       chr1   3218923 104393357      2 TCGA-06-0644-01A-02D-0310-01
5:       chr1 104418619 149879545      1 TCGA-06-0644-01A-02D-0310-01
6:       chr1 149885583 247812431      2 TCGA-06-0644-01A-02D-0310-01

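How can I convert this to wide format, so that each sample becomes a column holding its values (although, if I am not mistaken, the genomic ranges would need to be common across samples), like this: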

> head(wide)
 chr     start       end TCGA-05-4417-01A-22D-1854-01 TCGA-06-0644-01A-02D-0310-01 TCGA-06-0644-01A-02D-0310-01
 chr1  24254002  24291000          2          2         2
 chr3  47421002  49068000          1          0         0
 chr4  69204002  70320000          0          0         1
 chr5  58263002  59785000          0          1         1
 chr6  29010002  33287000          2          2         2
 chr7 110240002 111354000          0          0         0
>

Recommended answer

Does this work for you?

library(tidyr)

options(scipen = 999)

df <- structure(list(chromosome = c("chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1"), start = c(3218923L, 116324707L, 149879545L, 
3218923L, 104418619L, 149885583L), end = c(116319008L, 120523902L, 
247812431L, 104393357L, 149879545L, 247812431L), segVal = c(2L, 
1L, 4L, 2L, 1L, 2L), sample = c("TCGA-05-4417-01A-22D-1854-01", 
"TCGA-05-4417-01A-22D-1854-01", "TCGA-05-4417-01A-22D-1854-01", 
"TCGA-06-0644-01A-02D-0310-01", "TCGA-06-0644-01A-02D-0310-01", 
"TCGA-06-0644-01A-02D-0310-01")), class = "data.frame", row.names = c("1:", 
"2:", "3:", "4:", "5:", "6:"))

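# Each unique (chromosome, start, end) combination becomes one row;
# a sample with no segment at exactly those coordinates is filled with 0.
df <- df %>% 
  pivot_wider(names_from = sample, values_from = segVal, values_fill = 0)

# Print the result
df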

#> # A tibble: 6 x 5
#>   chromosome     start       end `TCGA-05-4417-01A-22D-1… `TCGA-06-0644-01A-02D-0…
#>   <chr>          <int>     <int>                    <int>                    <int>
#> 1 chr1         3218923 116319008                        2                        0
#> 2 chr1       116324707 120523902                        1                        0
#> 3 chr1       149879545 247812431                        4                        0
#> 4 chr1         3218923 104393357                        0                        2
#> 5 chr1       104418619 149879545                        0                        1
#> 6 chr1       149885583 247812431                        0                        2

Created on 2020-08-24 by the reprex package (v0.3.0)
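Note that pivot_wider() only lines up rows whose chromosome, start and end match exactly, so samples with different segment boundaries end up on separate rows. If shared ranges are what you need, one possible approach is to cut each chromosome at every breakpoint seen in any sample and then look up each sample's value per interval. The following is only a minimal base R sketch of that idea: to_common_ranges is a hypothetical helper (not part of tidyr), adjacent intervals share their endpoint, and intervals a sample does not cover are left as NA rather than 0.

```r
# Toy input copied from the question (two samples, chr1 only)
long <- data.frame(
  chromosome = "chr1",
  start  = c(3218923L, 116324707L, 149879545L, 3218923L, 104418619L, 149885583L),
  end    = c(116319008L, 120523902L, 247812431L, 104393357L, 149879545L, 247812431L),
  segVal = c(2L, 1L, 4L, 2L, 1L, 2L),
  sample = rep(c("TCGA-05-4417-01A-22D-1854-01",
                 "TCGA-06-0644-01A-02D-0310-01"), each = 3)
)

# Hypothetical helper: re-segment ONE chromosome onto the union of all
# samples' breakpoints, then look up each sample's value per interval.
to_common_ranges <- function(chr_df, samples) {
  breaks <- sort(unique(c(chr_df$start, chr_df$end)))
  bins <- data.frame(
    chromosome = chr_df$chromosome[1],
    start = head(breaks, -1),
    end   = tail(breaks, -1)
  )
  for (s in samples) {
    seg <- chr_df[chr_df$sample == s, ]
    bins[[s]] <- vapply(seq_len(nrow(bins)), function(i) {
      # A segment supplies the value only if it spans the whole interval
      hit <- which(seg$start <= bins$start[i] & seg$end >= bins$end[i])
      if (length(hit) == 0) NA_real_ else as.numeric(seg$segVal[hit[1]])
    }, numeric(1))
  }
  bins
}

# Apply per chromosome and stack the pieces
samples <- unique(long$sample)
wide <- do.call(
  rbind,
  lapply(split(long, long$chromosome), to_common_ranges, samples = samples)
)
rownames(wide) <- NULL
wide
```

For the six example rows this produces eight intervals, with NA in the small gaps between a sample's consecutive segments. For real data, a dedicated genomic-interval package is probably a better fit than this loop.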
