How to do histograms of this row-column table in R ggplot?


Problem description


    I am trying to plot the descriptive variables given in the first row of my table, using the procedure below. Quoting the column/row names did not help either.

    1. Rotate the rows and columns of the CSV data to get the corresponding data structure (tall table) required in the thread "A very simple histogram with R?", with ggplot.
    2. Plot a histogram of the events, where each variable has either an Absolute value XOR the triple (Average, Min, Max).

      • If there is only an absolute value, just draw the absolute value in the histogram.
      • If there are average, min and max, draw them in the histogram with whiskers (a whisker plot), where the limits of the whiskers are the min and max.
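One possible reading of the two rules above is a single bar layer plus an error-bar layer. The following is a minimal sketch under that assumption, not the method from the thread: it takes the average as the bar height when no absolute value exists, and the names `wide`, `height` and `p` are illustrative only.

```r
library(ggplot2)

# Wide table typed in by hand from the CSV in the question (one row per bar).
wide <- data.frame(
  Vars     = c("Sleep", "Awake", "REM", "Deep"),
  Absolute = c(NA, NA, 5, 7),
  Average  = c(7, 12, NA, NA),
  Min      = c(4, 5, NA, NA),
  Max      = c(10, 15, NA, NA)
)

# Assumed rule: bar height is the absolute value where present, else the average.
wide$height <- ifelse(is.na(wide$Absolute), wide$Average, wide$Absolute)

p <- ggplot(wide, aes(x = Vars, y = height)) +
  geom_col() +
  # Rows without Min/Max have NA limits; na.rm = TRUE drops them silently.
  geom_errorbar(aes(ymin = Min, ymax = Max), width = 0.2, na.rm = TRUE)
```

Calling `print(p)` draws the figure. REM and Deep get plain bars, while Sleep and Awake get bars with whiskers from Min to Max.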

    Data

    1. initially, data.csv

      "Vars"    , "Sleep", "Awake", "REM", "Deep"
      "Absolute",        ,       , 5     , 7
      "Average" , 7      , 12    ,       ,
      "Min"     , 4      , 5     ,       , 
      "Max"     , 10     , 15    ,       ,
      

    2. data after reshaping visually

                  V1       V2       V3       V4
      Vars  Absolute Average  Min      Max     
      Sleep     <NA>        7        4       10
      Awake     <NA>       12        5       15
      REM          5     <NA>     <NA>     <NA>
      Deep         7     <NA>     <NA>     <NA>
      

    3. data after reshaping for R

       data <- structure(list(V1 = structure(c(3L, NA, NA, 1L, 2L), .Names = c("Vars", 
       "Sleep", "Awake", "REM", "Deep"), .Label = c(" 5", " 7", "Absolute"
       ), class = "factor"), V2 = structure(c(3L, 2L, 1L, NA, NA), .Names = c("Vars", 
       "Sleep", "Awake", "REM", "Deep"), .Label = c("12", " 7", "Average "
       ), class = "factor"), V3 = structure(c(3L, 1L, 2L, NA, NA), .Names = c("Vars", 
      "Sleep", "Awake", "REM", "Deep"), .Label = c(" 4", " 5", "Min     "
       ), class = "factor"), V4 = structure(c(3L, 1L, 2L, NA, NA), .Names = c("Vars", 
      "Sleep", "Awake", "REM", "Deep"), .Label = c("10", "15", "Max     "
       ), class = "factor")), .Names = c("V1", "V2", "V3", "V4"), row.names = c("Vars", 
      "Sleep", "Awake", "REM", "Deep"), class = "data.frame")
      

    R code with debugging code

    dat.m <- read.csv("data.csv")
    
    # rotate rows and columns
    dat.m <- as.data.frame(t(dat.m)) # https://stackoverflow.com/a/7342329/54964 Comment 42-
    
    library("reshape2")
    dat.m <- melt(dat.m, id.vars="Vars")
    
    ## Just plot values existing there correspondingly    
    library("ggplot2")
    # https://stackoverflow.com/a/25584792/54964
    # TODO following
    #ggplot(dat.m, aes(x = "Vars", y = value,fill=variable)) 
    

    Error

    Error: id variables not found in data: Vars
    Execution halted
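The error is consistent with `Vars` no longer being a column once the table has been transposed: `read.csv` treats the first line as the header, and `t()` does not turn that header into a column named `Vars`. The sketch below illustrates this with a whitespace-free copy of the CSV, inlined as `csv_text` so that it runs standalone. The cleaned text and the names `raw`, `raw2`, `naive` and `wide` are assumptions for illustration.

```r
# Cleaned, whitespace-free copy of data.csv, inlined so the sketch is standalone.
csv_text <- "Vars,Sleep,Awake,REM,Deep
Absolute,,,5,7
Average,7,12,,
Min,4,5,,
Max,10,15,,"

# What the question's code does: after t(), the columns are V1..V4.
raw <- read.csv(text = csv_text)
naive <- as.data.frame(t(raw))
"Vars" %in% names(naive)   # FALSE, so melt(id.vars = "Vars") cannot find it

# Alternative: read the first column as row names, transpose,
# then restore the row names as an ordinary Vars column.
raw2 <- read.csv(text = csv_text, row.names = 1)
wide <- as.data.frame(t(raw2))
wide$Vars <- rownames(wide)
rownames(wide) <- NULL
```

With `wide` in this shape, a call such as `reshape2::melt(wide, id.vars = "Vars")` has an id column to work with.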
    

    R: 3.3.3, 3.4.0 (backports)
    OS: Debian 8.7
    R reshape2, ggplot2, ... with sessionInfo() after loading the two packages

    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C   
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] ggplot2_2.1.0  reshape2_1.4.2
    
    loaded via a namespace (and not attached):
     [1] colorspace_1.3-2 scales_0.4.1     magrittr_1.5     plyr_1.8.4      
     [5] tools_3.3.3      gtable_0.2.0     Rcpp_0.12.10     stringi_1.1.5   
     [9] grid_3.3.3       stringr_1.2.0    munsell_0.4.3    
    

    Testing HaberdashPI's proposal

    The output is in Fig. 1. It is wrong because an absolute value is drawn for Sleep and Awake. If a value is NA, just set it to zero.

    Fig. 1. Output of HaberdashPI's proposal, not as expected

    Data structure of dat.m before the transpose

    'data.frame':   4 obs. of  5 variables:
     $ Absolute: Factor w/ 2 levels " 5"," 7": NA NA 1 2
      ..- attr(*, "names")= chr  "Sleep" "Awake" "REM" "Deep"
     $ Average : Factor w/ 2 levels "12"," 7": 2 1 NA NA
      ..- attr(*, "names")= chr  "Sleep" "Awake" "REM" "Deep"
     $ Min     : Factor w/ 2 levels " 4"," 5": 1 2 NA NA
      ..- attr(*, "names")= chr  "Sleep" "Awake" "REM" "Deep"
     $ Max     : Factor w/ 2 levels "10","15": 1 2 NA NA
      ..- attr(*, "names")= chr  "Sleep" "Awake" "REM" "Deep"
     $ Vars    : chr  "Sleep" "Awake" "REM" "Deep"
          Absolute Average  Min      Max       Vars
    Sleep     <NA>        7        4       10 Sleep
    Awake     <NA>       12        5       15 Awake
    REM          5     <NA>     <NA>     <NA>   REM
    Deep         7     <NA>     <NA>     <NA>  Deep
    

    Data structure of dat.m after the transpose

    'data.frame':   16 obs. of  3 variables:
     $ Vars    : chr  "Sleep" "Awake" "REM" "Deep" ...
     $ variable: Factor w/ 4 levels "Absolute","Average ",..: 1 1 1 1 2 2 2 2 3 3 ...
     $ value   : chr  NA NA " 5" " 7" ...
    
        Vars variable value
    1  Sleep Absolute  <NA>
    2  Awake Absolute  <NA>
    3    REM Absolute     5
    4   Deep Absolute     7
    5  Sleep Average      7
    6  Awake Average     12
    7    REM Average   <NA>
    8   Deep Average   <NA>
    9  Sleep Min          4
    10 Awake Min          5
    11   REM Min       <NA>
    12  Deep Min       <NA>
    13 Sleep Max         10
    14 Awake Max         15
    15   REM Max       <NA>
    16  Deep Max       <NA>
    

    Testing akash87's proposal

    Code

    ds <- dat.m
    str(ds)
    ds
    ds$variable
    ds$variable %in% c("Min","Max")
    

    Wrong output, because every comparison is FALSE at the end

     $ Vars    : chr  "Sleep" "Awake" "REM" "Deep" ...
     $ variable: Factor w/ 4 levels "Absolute","Average ",..: 1 1 1 1 2 2 2 2 3 3 ...
     $ value   : chr  NA NA " 5" " 7" ...
        Vars variable value
    1  Sleep Absolute  <NA>
    2  Awake Absolute  <NA>
    3    REM Absolute     5
    4   Deep Absolute     7
    5  Sleep Average      7
    6  Awake Average     12
    7    REM Average   <NA>
    8   Deep Average   <NA>
    9  Sleep Min          4
    10 Awake Min          5
    11   REM Min       <NA>
    12  Deep Min       <NA>
    13 Sleep Max         10
    14 Awake Max         15
    15   REM Max       <NA>
    16  Deep Max       <NA>
    [1] "hello 3"
     [1] Absolute Absolute Absolute Absolute Average  Average  Average  Average 
     [9] Min      Min      Min      Min      Max      Max      Max      Max     
    Levels: Absolute Average  Min      Max     
     [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    [13] FALSE FALSE FALSE FALSE
    

    So ds[ds$variable %in% c("Min","Max"), ] returns no rows, because the all-FALSE result is carried forward.
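The all-FALSE result is consistent with the padded factor levels visible in the `str()` output (for example `"Average "`), since `%in%` matches strings exactly. The following sketch shows trimming the padding with `trimws()`; `ds` here is a small hand-built stand-in, not the full data set.

```r
# Stand-in for the molten data: levels carry the trailing padding from the CSV.
ds <- data.frame(
  Vars     = rep(c("Sleep", "Awake"), times = 2),
  variable = factor(rep(c("Min     ", "Max     "), each = 2)),
  value    = c(" 4", " 5", "10", "15"),
  stringsAsFactors = FALSE
)

any(ds$variable %in% c("Min", "Max"))   # FALSE: "Min     " is not "Min"

# Strip the padding from the level labels, and convert the values to numbers.
levels(ds$variable) <- trimws(levels(ds$variable))
ds$value <- as.numeric(trimws(ds$value))

all(ds$variable %in% c("Min", "Max"))   # TRUE
```

Converting `value` to numeric matters as well, because ggplot2 would otherwise treat the character column as discrete.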

    Testing Uwe's proposal

    The code uses explicit data.table::dcast and two calls to data.table::melt. sessionInfo() is printed just before the line molten <- .... Note that library(ggplot2) is not loaded yet, because the error comes from the line molten <- ....

    $ Rscript test111.r 
        Vars "Average" "Max" "Min" Absolute
    1: Sleep         7    10     4       NA
    2: Awake        12    15     5       NA
    3:   REM        NA    NA    NA        5
    4:  Deep        NA    NA    NA        7
    R version 3.4.0 (2017-04-21)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Debian GNU/Linux 8 (jessie)
    
    Matrix products: default
    BLAS: /usr/lib/openblas-base/libblas.so.3
    LAPACK: /usr/lib/libopenblasp-r0.2.12.so
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  base     
    
    other attached packages:
    [1] data.table_1.10.4
    
    loaded via a namespace (and not attached):
    [1] compiler_3.4.0 methods_3.4.0 
    Error in melt.data.table(transposed, measure.vars = c("Absolute", "Average")) : 
      One or more values in 'measure.vars' is invalid.
    Calls: <Anonymous> -> melt.data.table
    Execution halted
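The printed header (`Vars "Average" "Max" "Min" Absolute`) suggests that three column names contain literal double quotes, which would explain why `measure.vars = c("Absolute", "Average")` is rejected. The sketch below rests on that assumption: `transposed` is a hypothetical stand-in whose names are copied from the printed header, and the cleanup uses base R only.

```r
# Hypothetical stand-in for `transposed`: names copied from the printed header,
# including the literal double quotes around three of them.
transposed <- data.frame(
  Vars = c("Sleep", "Awake", "REM", "Deep"),
  a = c(7, 12, NA, NA), b = c(10, 15, NA, NA),
  c = c(4, 5, NA, NA),  d = c(NA, NA, 5, 7)
)
names(transposed) <- c("Vars", "\"Average\"", "\"Max\"", "\"Min\"", "Absolute")

"Average" %in% names(transposed)   # FALSE: the stored name includes the quotes

# Drop embedded quotes and surrounding whitespace from every column name.
names(transposed) <- trimws(gsub("\"", "", names(transposed), fixed = TRUE))

all(c("Absolute", "Average") %in% names(transposed))   # TRUE
```

After the cleanup the names match what `measure.vars` asks for. Whether this is the only problem in the original script cannot be confirmed from the output shown.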
    

    Testing Uwe's proposal with test code 2

    Code

    molten <- structure(list(Vars = structure(c(1L, 2L, 1L, 2L, 1L, 2L), class = "factor", .Label = c("V1", "V2")), variable = structure(c(1L, 1L, 2L, 2L, 3L, 3L), class = "factor", .Label = c("ave", "ave_max", "lepo")), value = c(7L, 8L, 10L, 10L, 4L, 4L)), .Names = c("Vars", "variable", "value"), row.names = c(NA, -6L), class = c("data.table", "data.frame"))
    
    print(molten)
    
    library(ggplot2)
    ggplot(molten, aes(x = Vars, y = value, fill = variable, ymin = lepo, ymax = ave_max)) + 
      geom_col() + geom_errorbar(width = 0.2)
    

    Output

      Vars variable value
    1   V1      ave     7
    2   V2      ave     8
    3   V1  ave_max    10
    4   V2  ave_max    10
    5   V1     lepo     4
    6   V2     lepo     4
    Error in FUN(X[[i]], ...) : object 'lepo' not found
    Calls: <Anonymous> ... by_layer -> f -> <Anonymous> -> f -> lapply -> FUN -> FUN
    Execution halted
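The message `object 'lepo' not found` fits the shape of `molten`: `lepo` and `ave_max` are values of the `variable` column, whereas `aes()` looks up column names. One way around this, sketched here with base `reshape()` rather than the data.table functions from the proposal, is to spread the statistics back into one column each. The name `wide` is illustrative.

```r
library(ggplot2)

# Long table from the test above, rebuilt as a plain data frame.
molten <- data.frame(
  Vars     = factor(rep(c("V1", "V2"), times = 3)),
  variable = factor(rep(c("ave", "ave_max", "lepo"), each = 2)),
  value    = c(7L, 8L, 10L, 10L, 4L, 4L)
)

# One column per statistic, so aes() can find the whisker limits by name.
wide <- reshape(molten, idvar = "Vars", timevar = "variable", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))   # value.ave -> ave, etc.

p <- ggplot(wide, aes(x = Vars, y = ave, ymin = lepo, ymax = ave_max)) +
  geom_col() +
  geom_errorbar(width = 0.2)
```

The alternative is to melt only `ave` in the first place, so that `lepo` and `ave_max` remain columns.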
    

    Solution

    The problem with your code is that you wrote "Vars" in quotes instead of the bare name Vars in the ggplot aes function. Also, the header of your data set is misplaced: Absolute, Average, ... should be the column names of the data set, not values in it. That is why you get the error from the melt function.

    Given your data set, here is my attempt:

    library(reshape2)
    library(ggplot2)

    #Data
    data = cbind.data.frame(c("Sleep", "Awake", "REM", "Deep"),
                            c(NA, NA, 5, 7),
                            c(7, 12, NA, NA),
                            c(4, 5, NA, NA),
                            c(10, 15, NA, NA))
    colnames(data) = c("Vars", "Absolute", "Average", "Min", "Max")
    
    #reshape
    dat.m <- melt(data, id.vars="Vars")
    #Stacked plot
    ggplot(dat.m, aes(x = Vars, y = value)) + geom_bar(aes(fill=variable), stat = "identity")
    

    This will produce:

    #Or multiple bars
    ggplot(dat.m, aes(x = Vars, y = value)) + 
      geom_bar(aes(fill=variable), stat = "identity", position="dodge") 
    

    #Or separated by Vars
    ggplot(dat.m, aes(x = Vars, y = value)) + geom_bar(aes(fill=variable), stat = "identity", position="dodge") + facet_wrap( ~ Vars, scales="free")
    

    I am adding another graph to the answer. This corroborates @Uwe's answer.

    #data
    data <- structure(list(Vars = structure(1:2, class = "factor", .Label = c("V1", "V2")), ave = c(7L, 8L), ave_max = c(10L, 10L), lepo = c(4L, 4L)), .Names = c("Vars", "ave", "ave_max", "lepo"), row.names = c(NA, -2L), class = c("data.table", "data.frame"), sorted = "Vars")
    #Melt
    library(data.table)
    library(ggplot2)
    mo = data.table::melt(data, measure.vars = c("ave"))
    ggplot(mo, aes(x = Vars, y = value, fill = variable, ymin = lepo, ymax = ave_max)) + geom_col() + geom_errorbar(width = 0.2)
    

    This will produce:
