Violin plot with constant data?


Problem description


    I am seeing some weird behaviour in violin plots when part of the data is constant.

    If I check for constant data and artificially add some small errors (e.g. by adding runif( N, min = -0.001, max = 0.001 )), the script will run. However, that distorts the other violin plot(s) into vertical line(s) (see 1), while the result should look something like 2.


    Question:

    Is it possible, when part of the data for a violin plot is constant, to

    • display a simple horizontal line for the respective constant data
    • display the other violin plots, as if the constant data wasn't present?


    R code:

    library(ggplot2)
    library(grid)
    library(gridExtra)
    
    N <- 20
    
    # use `=` (not `<-`) inside data.frame() so the columns are named idx, vals and type
    test_data <- data.frame(
      idx  = c( 1:N, 1:N ),
      vals = c( runif(N, 0, 1),
                rep(  0.5, N)),                                         # <- R script won't run
                #rep( 0.5, N) + runif( N, min = -0.001, max = 0.001 )), # <- delivers graphic (distorted)
      type = c( rep("range",  N),
                rep("const",  N))
    )
    
    grid.arrange(
      ggplot( test_data, aes( x = idx, y = vals)) + 
        geom_line( aes(colour = type)),
      ggplot( test_data, aes( x = type, y = vals)) + 
        geom_violin( aes( fill = type),
                     position = position_dodge(width = 1))
    )
    

    Solution

    I finally managed to get a violin plot in which some group(s) have zero variance (standard deviation), so that it will

    • display a flat line for zero-variance groups
    • display normal violin plots for the other groups

    In my example I have 3 groups of data: two with non-zero variance and a third that is constant. While accumulating the groups, I calculate the standard deviation of each one (the variance would serve equally well).

    library(ggplot2)
    library(gridExtra)
    
    N <- 20
    
    test_data <- data.frame()
    
    # random data from range
    for( grp_id in 1:2)
    {
        group_data <- data.frame(
          idx  = 1:N,
          vals = runif(N, grp_id, grp_id + 1),
          type = paste("range", grp_id)
        )
        group_data$sd_group <- sd( group_data$vals)
        test_data = rbind( test_data, group_data)
    }
    
    # constant data
    group_data = data.frame(
        idx  = 1:N,
        vals = rep( 0.5, N),
        type = "const"
    )
    group_data$sd_group <- sd( group_data$vals)
    

    As suggested, I add a small offset to obtain a violin plot for the group 'const':

    # add a little jittering to get the flat line
    if( 0 == group_data$sd_group[1])
    {
        group_data$vals[1] = group_data$vals[1] + 0.00001
    }
    test_data = rbind( test_data, group_data)
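
The same check can be applied to a data frame that has already been assembled, rather than to each group as it is built. The following is only a sketch under that assumption: the helper name `nudge_constant_groups`, its arguments and the `demo` data are hypothetical and not part of the original answer.

```r
# Hypothetical helper (not from the original answer): nudge one value in every
# zero-variance group so that the group no longer consists of identical values.
nudge_constant_groups <- function(df, value_col = "vals", group_col = "type", eps = 1e-5) {
  for (grp in unique(df[[group_col]])) {
    rows <- which(df[[group_col]] == grp)
    # sd() needs at least two values; single-row groups are left untouched
    if (length(rows) > 1 && sd(df[[value_col]][rows]) == 0) {
      df[[value_col]][rows[1]] <- df[[value_col]][rows[1]] + eps
    }
  }
  df
}

# small illustrative data set: one varying group and one constant group
demo <- data.frame(
  vals = c(0.1, 0.7, 0.4, rep(0.5, 3)),
  type = rep(c("range", "const"), each = 3)
)
fixed <- nudge_constant_groups(demo)
```

Only the first value of each constant group is changed, so groups that already vary are returned exactly as they were passed in.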
    

    The only thing left to do now is to scale all violin plots to the same width:

    grid.arrange(
        ggplot( test_data, aes( x = idx)) + 
            geom_line( aes( y = vals, colour = type)),
        ggplot( test_data, aes( x = type, y = vals, fill = type)) + 
            geom_violin( scale = "width"),
        ncol = 1
    )
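
If modifying the data is undesirable, one possible alternative is to leave the constant group out of the violin layer and draw it as a separate flat line. This is a sketch of a different technique, not the answer's method: it uses `geom_errorbar()` with `ymin` equal to `ymax` as a stand-in for a horizontal line, and the names `varying` and `constant`, the call to `set.seed()` and the bar width of 0.5 are assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # reproducible random data (added for this sketch)
N <- 20
test_data <- rbind(
  data.frame(vals = runif(N, 1, 2), type = "range 1"),
  data.frame(vals = runif(N, 2, 3), type = "range 2"),
  data.frame(vals = rep(0.5, N),    type = "const")
)

# attach each group's standard deviation to every row of that group
test_data$sd_group <- ave(test_data$vals, test_data$type, FUN = sd)

# split into groups that vary and groups that are constant
varying  <- test_data[test_data$sd_group > 0, ]
constant <- unique(test_data[test_data$sd_group == 0, c("type", "vals")])

p <- ggplot() +
  # violins only for the groups with non-zero spread
  geom_violin(data = varying,
              aes(x = type, y = vals, fill = type),
              scale = "width") +
  # a zero-height error bar is drawn as a flat horizontal line
  geom_errorbar(data = constant,
                aes(x = type, ymin = vals, ymax = vals),
                width = 0.5)

print(p)
```

Because the constant group never reaches the violin layer, the original values stay unmodified and the other violins are drawn as if that group were absent.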
    
