Reshape Data Frame Based on Corresponding Column's Identifier R

Views: 69 Posted: 2020/10/16 20:11:04 Tags: r dplyr tidyr data-cleaning


Problem description

I'm trying to reshape a two-column data frame by collapsing the rows that share a value in column 2 (here, ticker symbols) so that each ticker gets its own unique row, while the contents of column 1, which are the data fields that correspond to those tickers, become their own columns. Here is a small sample; the full data frame has 500 tickers and 4 fields:

                            test22 Ticker
         Current SharePrice $6.57    MFM
                Current NAV $7.11    MFM
  Current Premium/Discount -7.59%    MFM
         52WkAvg SharePrice $6.55    MFM
                52WkAvg NAV $7.21    MFM
 52WkAvg Premium/Discount -9.19%    MFM
        52WkHigh SharePrice $6.88    MFM
               52WkHigh NAV $7.34    MFM
 52WkHigh Premium/Discount -5.88%    MFM
        52WkLow SharePrice $6.05    MFM
               52WkLow NAV $7.03    MFM
52WkLow Premium/Discount -14.43%    MFM
        Current SharePrice $4.84    CXE
               Current NAV $5.21    CXE
 Current Premium/Discount -7.10%    CXE
        52WkAvg SharePrice $4.91    CXE
               52WkAvg NAV $5.29    CXE
 52WkAvg Premium/Discount -7.26%    CXE
       52WkHigh SharePrice $5.31    CXE
              52WkHigh NAV $5.37    CXE
52WkHigh Premium/Discount -1.12%    CXE
        52WkLow SharePrice $4.58    CXE
               52WkLow NAV $5.16    CXE
52WkLow Premium/Discount -11.92%    CXE

Ideally, after the transformation each ticker is a unique row, with the ticker as the row name and, in this case, 12 corresponding columns containing the contents of the "test22" column. The names of those columns aren't important at this stage. Help is much appreciated!

Recommended answer

I interpreted this problem as converting long data into a wide format. The hardest part is separating the number from the description. Once that is done, the spread function converts the data to wide format.

df<-structure(list(test22 = structure(c(24L, 20L, 22L, 6L, 2L, 4L, 
12L, 8L, 10L, 18L, 14L, 16L, 23L, 19L, 21L, 5L, 1L, 3L, 11L, 
7L, 9L, 17L, 13L, 15L), .Label = c("52WkAvg NAV $5.29", "52WkAvg NAV $7.21", 
"52WkAvg Premium/Discount -7.26%", "52WkAvg Premium/Discount -9.19%", 
"52WkAvg SharePrice $4.91", "52WkAvg SharePrice $6.55", "52WkHigh NAV $5.37", 
"52WkHigh NAV $7.34", "52WkHigh Premium/Discount -1.12%", "52WkHigh Premium/Discount -5.88%", 
"52WkHigh SharePrice $5.31", "52WkHigh SharePrice $6.88", "52WkLow NAV $5.16", 
"52WkLow NAV $7.03", "52WkLow Premium/Discount -11.92%", "52WkLow Premium/Discount -14.43%", 
"52WkLow SharePrice $4.58", "52WkLow SharePrice $6.05", "Current NAV $5.21", 
"Current NAV $7.11", "Current Premium/Discount -7.10%", "Current Premium/Discount -7.59%", 
"Current SharePrice $4.84", "Current SharePrice $6.57"), class = "factor"), 
    Ticker = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
    ), .Label = c("CXE", "MFM"), class = "factor")), class = "data.frame", row.names = c(NA, 
-24L))


library(tidyr)

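# Separate the number from the text. The regex splits at the space that is
# followed by an optional "$", then digits, "-" or ".", and an optional "%";
# extra = "merge" keeps everything after that first split in "value".
df2 <- separate(df, test22, into = c("key", "value"), sep = " (?=[$]*[-\\.0-9]+%*)", extra = "merge")

# Spread from long to wide: one row per Ticker, one column per key
spread(df2, key = key, value = value)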

#columns are abridged for clarity
#Ticker 52WkAvg NAV 52WkAvg Premium/Discount 52WkAvg SharePrice 52WkHigh NAV 52WkHigh Premium/Discount 52WkHigh ...
#CXE       $5.29                   -7.26%              $4.91        $5.37                    -1.12%                     
#MFM       $7.21                   -9.19%              $6.55        $7.34                    -5.88%
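
As a possible variation on the answer above, here is a minimal sketch that uses pivot_wider(), which the tidyr documentation recommends over the now superseded spread(). It assumes a tidyr version of 1.0.0 or later. The four-row data frame is a cut-down stand-in for the question's data. The names wide, wide_num, and wide_df are illustrative only. The numeric conversion and the row-name step are optional extras, not part of the original answer.

```r
library(tidyr)

# Cut-down stand-in for the question's data: two tickers, two fields each
df <- data.frame(
  test22 = c("Current SharePrice $6.57", "Current Premium/Discount -7.59%",
             "Current SharePrice $4.84", "Current Premium/Discount -7.10%"),
  Ticker = c("MFM", "MFM", "CXE", "CXE"),
  stringsAsFactors = FALSE
)

# Same split as in the answer: break at the space that precedes the value
df2 <- separate(df, test22, into = c("key", "value"),
                sep = " (?=[$]*[-\\.0-9]+%*)", extra = "merge")

# Long to wide: one row per Ticker, one column per distinct key
wide <- pivot_wider(df2, names_from = key, values_from = value)

# Optional: strip "$" and "%" so the value columns become numeric
wide_num <- wide
wide_num[-1] <- lapply(wide_num[-1], function(x) as.numeric(gsub("[$%]", "", x)))

# Optional: use the ticker as the row name, as the question asked
wide_df <- as.data.frame(wide_num)
rownames(wide_df) <- wide_df$Ticker
wide_df$Ticker <- NULL
```

Because the column names contain spaces, a single value is easiest to reach with double brackets or row/column indexing, for example wide_df["MFM", "Current SharePrice"].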
