Ternary plot - scaling opacity across groups

Question

I am trying to make a contour ternary plot, with two groups, where the opacity (alpha) of the contours is a function of the density of the points (e.g. more points tightly clustered = less opaque).

I am stuck on one point. My two groups (here A and B) have unequal sizes (A = 150 obs, B = 50 obs), which means the points in one group are frequently a lot more clustered. When this is the case, the opacity of group B should be much higher relative to group A, since the points in group B are much less dense. But it looks like opacity is scaled within groups instead of across groups.

My Question: is it possible to scale opacity to the density of points, where density is relative across both groups?

An example:

library(ggtern)

set.seed(1234)

# example data
df <- data.frame(X = c(runif(150, 0.7, 1),runif(50, 0, 0.3)),
                 Y = c(runif(150, 0, 0.3),runif(50, 0, 0.3)),
                 Z = c(runif(150, 0, 0.5),runif(50, 0.5, 1)),
                 D = c(rep("A", 150), rep("B", 50)))


# ternary plot 
ggtern(df, aes(x = X,y = Y, z = Z, color = D)) +
  stat_density_tern(aes(alpha = ..level.., fill = D), 
                    geom = 'polygon', 
                    bins = 10,
                    color = "grey") +
  geom_point(alpha = 0.5) +
  scale_colour_manual(values = c("tomato3", "turquoise4"))

# points are only displayed to show densities, I don't plan on showing 
# points in the final plot

Given that the group B points are much less dense, I would expect its contours to be more opaque than those of group A.

Another option would be to use scale_colour_gradient(), but I can't see how to get two separate gradients (one for each of A and B) on a single plot.

Recommended answer

I wish I had a simpler answer for you, but alas, I don't. However, I've found a fairly hacky solution to your problem: write a new stat and predefine the breaks. Disclaimer: I don't use ggtern myself, so I don't know much about its specifics. The general problem seems to be that the density is computed per group, and each group's density is normalised to integrate to 1, so the group size is lost. We can solve this by adding a new stat that rescales the density for us.
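To make the normalisation point concrete, here is a minimal sketch outside ggtern. It assumes MASS::kde2d as a stand-in for the density estimator and uses two made-up groups of 150 and 50 points; the helper kde_integral is hypothetical and only exists for this illustration.

```r
library(MASS)  # kde2d(); MASS ships with standard R installations

set.seed(1234)

# Two hypothetical groups of unequal size, both inside the unit square
group_a <- data.frame(x = runif(150), y = runif(150))
group_b <- data.frame(x = runif(50),  y = runif(50))

# Riemann-sum approximation of the integral of a 2D kernel density estimate.
# The limits are deliberately wider than the data so the tails are included.
kde_integral <- function(d, lims = c(-1, 2, -1, 2), n = 100) {
  dens <- kde2d(d$x, d$y, n = n, lims = lims)
  cell_area <- diff(dens$x[1:2]) * diff(dens$y[1:2])
  sum(dens$z) * cell_area
}

int_a <- kde_integral(group_a)
int_b <- kde_integral(group_b)
print(c(A = int_a, B = int_b))  # both close to 1, regardless of group size
```

Because both estimates carry the same total mass, the contour levels of the two groups say nothing about how many points each group contains.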

The solution then seems deceptively simple: multiply the calculated densities by the number of data points in the group to get a density scaled to reflect group size. The only drawback is that we have to replace bins = 10, which is calculated per group, with breaks = seq(start, end, by = somenumber), so that the contour breaks are absolute instead of relative.
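The rescaling idea can be sketched in isolation, again assuming MASS::kde2d as a stand-in estimator and two made-up groups with the same spread but different sizes. The helper scaled_kde and the way the shared breaks are chosen are illustrative assumptions, not what ggtern does internally.

```r
library(MASS)  # kde2d()

set.seed(1234)

# Hypothetical groups with the same spread but unequal sizes
group_a <- data.frame(x = runif(150, 0, 0.3),   y = runif(150, 0, 0.3))
group_b <- data.frame(x = runif(50,  0.6, 0.9), y = runif(50,  0.6, 0.9))

# KDE on a shared grid, multiplied by the group size (the rescaling step)
scaled_kde <- function(d, lims = c(-1, 2, -1, 2), n = 100) {
  dens <- kde2d(d$x, d$y, n = n, lims = lims)
  dens$z <- dens$z * nrow(d)
  dens
}

dens_a <- scaled_kde(group_a)
dens_b <- scaled_kde(group_b)

# After rescaling, each surface integrates to roughly its group size
cell_area <- diff(dens_a$x[1:2]) * diff(dens_a$y[1:2])
mass_a <- sum(dens_a$z) * cell_area
mass_b <- sum(dens_b$z) * cell_area

# One set of absolute breaks shared by both groups (an illustrative choice)
peak <- max(dens_a$z, dens_b$z)
breaks <- seq(0, peak, length.out = 11)[-1]
levels_a <- sum(breaks < max(dens_a$z))
levels_b <- sum(breaks < max(dens_b$z))
print(c(A = levels_a, B = levels_b))  # the larger group reaches more levels
```

With shared breaks, a given level means the same thing in both groups, which is what an alpha scale mapped to the level needs in order to be comparable across groups.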

However, ggtern is quite a complicated package, with its own peculiarities that make it difficult to get a new stat function to work. It keeps a list of 'approved stats', and ggtern will remove any layer whose stat is not on that list.

ggtern:::.approvedstat

             identity            confidence          density_tern           smooth_tern 
       "StatIdentity"  "StatConfidenceTern"     "StatDensityTern"      "StatSmoothTern" 
                  sum                unique      interpolate_tern          mean_ellipse 
            "StatSum"          "StatUnique" "StatInterpolateTern"     "StatMeanEllipse" 
             hex_tern              tri_tern
        "StatHexTern"         "StatTriTern"

So the first order of business is to add an entry for our own stat (which we'll call StatDensityTern2) to the approved stat list. Since .approvedstat lives in the package namespace, we have to be a bit hacky to do this:

approveupdate <- c(ggtern:::.approvedstat, "density_tern2" = "StatDensityTern2")
assignInNamespace(".approvedstat", approveupdate, pos = "package:ggtern")
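Running the two lines above more than once in a session appends a duplicate entry each time. A possible idempotent variant is sketched below; approve_stat is a hypothetical helper (not part of ggtern), and the sketch assumes that the installed ggtern version still keeps its list in an internal object called .approvedstat, as shown in the output above.

```r
# Hypothetical helper: add a stat to ggtern's approved list only if it is
# not already there, so re-running the script does not create duplicates.
approve_stat <- function(key, class_name, pkg = "ggtern") {
  ns <- asNamespace(pkg)
  approved <- get(".approvedstat", envir = ns, inherits = FALSE)
  if (!key %in% names(approved)) {
    approved <- c(approved, stats::setNames(class_name, key))
    utils::assignInNamespace(".approvedstat", approved, ns = pkg)
  }
  invisible(get(".approvedstat", envir = ns, inherits = FALSE))
}

# Only run when ggtern is installed and exposes the assumed internal object
has_list <- requireNamespace("ggtern", quietly = TRUE) &&
  exists(".approvedstat", envir = asNamespace("ggtern"), inherits = FALSE)

if (has_list) {
  approve_stat("density_tern2", "StatDensityTern2")
  approve_stat("density_tern2", "StatDensityTern2")  # second call is a no-op
  print(tail(ggtern:::.approvedstat, 1))
}
```

Passing ns = "ggtern" instead of pos = "package:ggtern" also means the package does not have to be attached with library() first.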

Now we can write our own StatDensityTern2, that inherits functionality from StatDensityTern, with a small update in how groups are computed. While writing this new stat, we need to take care that we load necessary packages and refer to internal functions correctly. We'll largely copy-paste from the existing StatDensityTern$compute_group, but make a small adjustment to change z = as.vector(dens$z) to z = as.vector(dens$z) * nrow(data) before passing on the data to the contour function.

library(ggtern)        # StatDensityTern, coord_tern(), tlr2xy()
library(compositions)  # acomp(), ilr(), ilrInv()
library(rlang)         # %||%

StatDensityTern2 <-
  ggproto(
    "StatDensityTern2",
    StatDensityTern,
    compute_group = function(
      self, data, scales, na.rm = FALSE, n = 100, h = NULL,
      bdl = 0, bdl.val = NA, contour = TRUE, base = "ilr", expand = 0.5,
      weight = NULL, bins = NULL, binwidth = NULL, breaks = NULL
    ) {
      if (!c(base) %in% c("identity", "ilr")) 
        stop("base must be either identity or ilr", call. = FALSE)
      raes = self$required_aes
      data[raes] = suppressWarnings(compositions::acomp(data[raes]))
      data[raes][data[raes] <= bdl] = bdl.val[1]
      data = remove_missing(data, vars = self$required_aes, na.rm = na.rm, 
                            name = "StatDensityTern", finite = TRUE)
      if (ggplot2:::empty(data)) 
        return(data.frame())
      coord = coord_tern()
      f = get(base, mode = "function")
      fInv = get(sprintf("%sInv", base), mode = "function")
      if (base == "identity") 
        data = tlr2xy(data, coord, inverse = FALSE, scale = TRUE)
      h = h %||% ggtern:::estimateBandwidth(base, data[which(colnames(data) %in% 
                                                      raes)])
      if (length(h) != 2) 
        h = rep(h[1], 2)
      if (base != "identity" && diff(h) != 0) 
        warning("bandwidth 'h' has different x and y bandwiths for 'ilr', this may (probably will) introduce permutational artifacts depending on the ordering", 
                call. = FALSE)
      data[raes[1:2]] = suppressWarnings(f(as.matrix(data[which(colnames(data) %in% 
                                                                  raes)])))
      expand = if (length(expand) != 2) 
        rep(expand[1], 2)
      else expand
      rngxy = range(c(data$x, data$y))
      rngx = scales:::expand_range(switch(base, identity = coord$limits$x, 
                                 rngxy), expand[1])
      rngy = scales:::expand_range(switch(base, identity = coord$limits$y, 
                                 rngxy), expand[2])
      dens = ggtern:::kde2d.weighted(data$x, data$y, h = h, n = n, lims = c(rngx, 
                                                                   rngy), w = data$weight)

# Here be relevant changes ------------------------------------------------


      df = data.frame(expand.grid(x = dens$x, y = dens$y), 
                      z = as.vector(dens$z) * nrow(data), 
                      group = data$group[1])

# Here end relevant changes -----------------------------------------------


      if (contour) {
        df = StatContour$compute_panel(df, scales, bins = bins, 
                                       binwidth = binwidth, breaks = breaks)
      }
      else {
        names(df) <- c("x", "y", "density", "group")
        df$level <- 1
        df$piece <- 1
      }
      if (base == "identity") 
        df = tlr2xy(df, coord, inverse = TRUE, scale = TRUE)
      df[raes] = suppressWarnings(fInv(as.matrix(df[which(colnames(df) %in% 
                                                            raes)])))
      df
    }
  )
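The modified step relies on expand.grid() and as.vector() walking the density grid in the same order. A small check of that alignment is sketched below, using MASS::kde2d as a stand-in for ggtern's internal kde2d.weighted (which is assumed here to return the same x, y, z layout) and made-up points.

```r
library(MASS)  # kde2d()

set.seed(1234)

# Hypothetical points standing in for one group's transformed coordinates
pts <- data.frame(x = runif(50), y = runif(50))
dens <- kde2d(pts$x, pts$y, n = 25)

# Same construction as in the stat: long-format grid with count-scaled z
grid_df <- data.frame(expand.grid(x = dens$x, y = dens$y),
                      z = as.vector(dens$z) * nrow(pts))

# dens$z has one row per x value and one column per y value; as.vector()
# reads it column by column, so x varies fastest, just like expand.grid().
i <- 7
j <- 19
row_id <- (j - 1) * length(dens$x) + i

aligned <- grid_df$x[row_id] == dens$x[i] &&
  grid_df$y[row_id] == dens$y[j] &&
  isTRUE(all.equal(grid_df$z[row_id], dens$z[i, j] * nrow(pts)))
print(aligned)
```

Multiplying by nrow(pts) changes only the height of the surface, not its shape, so the contour geometry within a group stays the same and only the levels become comparable across groups.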

Now that we've written a new stat and have approved of the stat ourselves, we can use it in the following manner:

set.seed(1234)

# example data
df <- data.frame(X = c(runif(150, 0.7, 1),runif(50, 0, 0.3)),
                 Y = c(runif(150, 0, 0.3),runif(50, 0, 0.3)),
                 Z = c(runif(150, 0, 0.5),runif(50, 0.5, 1)),
                 D = c(rep("A", 150), rep("B", 50)))

ggtern(df, aes(x = X, y = Y, z = Z, color = D)) +
  geom_polygon(aes(alpha = ..level.., fill = D),
               stat = "DensityTern2",
               breaks = seq(10, 150, by = 10),
               color = "grey") +
  geom_point(alpha = 0.5) +
  scale_colour_manual(values = c("tomato3", "turquoise4"))

Which gave me the following plot:

Hope you find this useful!
