Reorganize data frame elements depending on the content of the rows in R


Problem description

I have this data set:

df <- structure(list(V1 = c("B1D01", "B1D01", "B1D01", "B1D01", "B1D01", 
"B1D01", "U0155"), V2 = c("U0155", "U0155", "U0155", "U0155", 
"U0155", "U0155", "U3003"), V3 = c("U3003", "U3003", "C1B00", 
"U3003", "U3003", "U3003", "C1B00"), V4 = c("C1B00", "C1B00", 
"U0073", "C1B00", "C1B00", "C1B00", "P037D"), V5 = c("P037D", 
"P037D", NA, "P037D", "P037D", "P037D", "P0616"), V6 = c("P0616", 
"P0616", NA, "P0616", "P0616", "P0616", "P0562"), V7 = c("P0562", 
"P0562", NA, "P0562", "P0562", "P0562", "U0073"), V8 = c("U0073", 
"U0073", NA, "U0073", "U0073", "U0073", NA)), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8"), row.names = 1719:1725, class = "data.frame")

When I print(df), I get:

        V1    V2    V3    V4    V5    V6    V7    V8
1719 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1720 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1721 B1D01 U0155 C1B00 U0073  <NA>  <NA>  <NA>  <NA>
1722 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1723 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1724 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1725 U0155 U3003 C1B00 P037D P0616 P0562 U0073  <NA>

As you can see, the codes are mixed across columns. For instance, U3003 appears mostly in V3, but it can also appear in V2 (last row).

I would like to reorganize this data frame under these conditions:


  • Each code should be placed in its own column.
  • The name of each column should be the name of its code.
  • If there are more than 8 distinct codes, the number of columns should match the number of codes.
  • The cell values should keep the name of the code.
  • If a code is not present in a row, NA must appear.

Be aware that my original data frame contains many more rows than this small example extracted from it.

Recommended answer

The best way I found is to 'massage' the data frame: pivot it to a longer form (one row per row name and code), then pivot it back to a wide form with one column per code:

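library(tidyverse)

df %>% 
  # keep the original row names as a regular "rowname" column
  rownames_to_column() %>% 
  # one row per (rowname, code) pair; NA cells are dropped
  pivot_longer(-rowname, values_drop_na = TRUE) %>% 
  # one column per code; id_cols is named explicitly rather than passed by position
  pivot_wider(id_cols = rowname, names_from = value, values_from = value)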

#> # A tibble: 7 x 9
#>   rowname B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#>   <chr>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1719    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 2 1720    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 3 1721    B1D01 U0155 <NA>  C1B00 <NA>  <NA>  <NA>  U0073
#> 4 1722    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 5 1723    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 6 1724    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 7 1725    <NA>  U0155 U3003 C1B00 P037D P0616 P0562 U0073


Created on 2020-04-03 by the reprex package (v0.3.0)
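
For readers who prefer not to load the tidyverse, the same reshaping can probably be done in base R. The following is only a minimal sketch, not part of the original answer: it uses a small three-row stand-in for the question's data, and the names all_values, codes, cols and wide are made up for illustration. It collects the distinct codes in order of first appearance, then builds one column per code that holds the code where the row contains it and NA otherwise.

```r
# Small stand-in for the question's data (same layout, fewer rows and columns)
df <- data.frame(
  V1 = c("B1D01", "B1D01", "U0155"),
  V2 = c("U0155", "U0155", "U3003"),
  V3 = c("U3003", "C1B00", "C1B00"),
  V4 = c("C1B00", "U0073", NA),
  row.names = 1719:1721,
  stringsAsFactors = FALSE
)

# All cell values read row by row, then the distinct non-NA codes
# in order of first appearance
all_values <- as.vector(t(as.matrix(df)))
codes <- unique(all_values[!is.na(all_values)])

# One column per code: the code itself where the row contains it, NA otherwise
cols <- lapply(codes, function(code) {
  present <- rowSums(df == code, na.rm = TRUE) > 0
  ifelse(unname(present), code, NA_character_)
})
names(cols) <- codes

wide <- data.frame(
  cols,
  row.names = rownames(df),
  check.names = FALSE,       # keep the codes as column names unchanged
  stringsAsFactors = FALSE
)

print(wide)
```

Because this version only tests whether a code is present in a row, a code that occurs twice in the same row still yields a single cell. With the pivot-based approach, such duplicates would, as far as I know, make pivot_wider() warn and return list-columns, so they may need to be removed first.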
