geom_density returns plot without considering real values

Views: 65  Posted: 2021/5/10 19:55:44  Tags: r ggplot2 density-plot

Question

I am trying to draw a density plot for 3 variables over 7 different geographical points, but the output is not what I expected. N should be higher in the middle, yet all three variables show the same pattern, which does not reflect the real values. Why is this, and how can I fix it?

Variable1 <- c(rep("E",7), rep("N",7),rep("L",7))
Variable2 <- c(rep(1:7, 3))
value <- c(12.44035, 11.98035333, 11.40821, 12.15833, 13.14826, 11.99339667, 12.17363, 4.073096, 3.946134667, 6.244152, 5.76892, 4.545772, 3.580206667, 2.879470667, 3.6912875, 3.501247, 2.684179, 3.06306, 3.364774, 4.485021333, 3.373649333)
df <- data.frame(Variable1, Variable2, value)

library(ggplot2)   # provides ggplot() and aes(); ggridges does not attach it
library(ggridges)
ggplot(df, aes(x = Variable2, y = Variable1)) +
  geom_density_ridges(aes(fill = Variable1))

I would like something like this:

Solution

You are calculating the density of your x-axis variable, which in your case is Variable2. It holds the same values (1, 2, ..., 7) for every Variable1, so every group gets the same density.

So I think you want value on your x-axis, and you don't actually need Variable2, as it's a mere index.

ggplot(df, aes(x=value, y=Variable1)) +
  geom_density_ridges(aes(fill=Variable1))
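
To make the point above concrete, here is a minimal sketch that uses base R's density() as a stand-in for the density estimate behind the ridge plot (an assumption for illustration: ggridges uses its own stat, but the idea is the same). The names dens_index and dens_value are made up for this example.

```r
# Data from the question
Variable1 <- c(rep("E", 7), rep("N", 7), rep("L", 7))
Variable2 <- rep(1:7, 3)
value <- c(12.44035, 11.98035333, 11.40821, 12.15833, 13.14826, 11.99339667, 12.17363,
           4.073096, 3.946134667, 6.244152, 5.76892, 4.545772, 3.580206667, 2.879470667,
           3.6912875, 3.501247, 2.684179, 3.06306, 3.364774, 4.485021333, 3.373649333)
df <- data.frame(Variable1, Variable2, value)

# Kernel density estimated separately for each group
dens_index <- lapply(split(df$Variable2, df$Variable1), function(x) density(x)$y)
dens_value <- lapply(split(df$value, df$Variable1), function(x) density(x)$y)

# Variable2 is 1..7 in every group, so the three curves coincide
print(identical(dens_index$E, dens_index$N))
print(identical(dens_index$E, dens_index$L))

# value differs between groups, so the curves differ
print(identical(dens_value$E, dens_value$N))
```

The first two comparisons are on identical inputs, which is why the ridge plot in the question shows the same shape three times.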

EDIT 1:

The geom you actually want is geom_line, or geom_smooth (for prettier graphs), or maybe geom_area to fill the area under the curves.

Now, one way of doing it is to put all the curves on the same y scale:

ggplot(df, aes(x=Variable2, y=value, color=Variable1)) +
  geom_smooth(fill=NA)

But this doesn't give the separation that you wanted. The way I know to get it is to make one plot for each Variable1 and arrange them together (the ggridges package may have an option for this, but I have never used it). To do that we build a "base" graph:

g = ggplot(df, aes(x=Variable2, y=value)) +
  geom_smooth(fill=NA) +
  theme(axis.text.x  = element_blank(),
        axis.title.x = element_blank())

Here we removed the x-axis so that it is added only once in the grid. Then we apply that base to each variable, one at a time, with a for loop:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) +
               ylim(min(df2$value),max(df2$value)))}

This creates one graph for each Variable1, named as the variable itself. Now we add the x-axis in the last plot and arrange them together:

N = N + theme(axis.text.x  = element_text(),
              axis.title.x = element_text())

gridExtra::grid.arrange(E,L,N, nrow=3)

Output:
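
As a possible alternative to the loop and grid.arrange above, ggplot2's own faceting can stack one panel per Variable1 with independent y ranges and a single x-axis. This is a sketch of a different technique, not the approach used in this answer, and it draws plain lines with geom_line rather than smoothed curves. The name p is chosen for this example only.

```r
library(ggplot2)

# Data from the question
Variable1 <- c(rep("E", 7), rep("N", 7), rep("L", 7))
Variable2 <- rep(1:7, 3)
value <- c(12.44035, 11.98035333, 11.40821, 12.15833, 13.14826, 11.99339667, 12.17363,
           4.073096, 3.946134667, 6.244152, 5.76892, 4.545772, 3.580206667, 2.879470667,
           3.6912875, 3.501247, 2.684179, 3.06306, 3.364774, 4.485021333, 3.373649333)
df <- data.frame(Variable1, Variable2, value)

# One row of panels per Variable1; scales = "free_y" lets each panel
# use its own y range, and the x-axis is drawn once at the bottom
p <- ggplot(df, aes(x = Variable2, y = value, color = Variable1)) +
  geom_line() +
  facet_grid(Variable1 ~ ., scales = "free_y")
p
```

Because Variable1 is a character column, the panels appear in alphabetical order (E, L, N), the same order used in the grid.arrange call.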

EDIT 2:

To use colors, we first leave the geom out of g:

g = ggplot(df, aes(x=Variable2, y=value)) +
  theme(axis.text.x  = element_blank(),
        axis.title.x = element_blank())

Then we create a vector of colors that we'll use in the loop:

color = c("red", "green", "blue")
names(color) = unique(df$Variable1)
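
To show how this named vector behaves inside the loop, here is a small sketch that only prints the lookup; the cat() call is added for illustration and is not part of the plotting code.

```r
Variable1 <- c(rep("E", 7), rep("N", 7), rep("L", 7))

color <- c("red", "green", "blue")
# unique() keeps first-appearance order, so the names are "E", "N", "L"
names(color) <- unique(Variable1)

# Indexing by name returns the color assigned to that variable
for (i in unique(Variable1)) {
  cat(i, "->", color[i], "\n")
}
```

Indexing with color[i] is what lets each iteration pick its own fill without any if/else logic.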

Then we pass the color argument inside the geom that we omitted earlier.

But first, let me talk about the available geoms. We could use a smoothed area geom, which gives something like this:

This looks good but leaves a lot of useless area under the curves. To change that, we can use geom_ribbon with aes(ymin=min(value)-0.1, ymax=value), together with ylim(min(df2$value)-0.1, max(df2$value)), to stop the graph at the minimal value (minus 0.1). The problem is that ggplot's smoothing function doesn't work well with geom_ribbon, so we only have the option of a "rough" graph:

Code for the smooth area:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) +
         stat_smooth(geom="area", fill=color[i]))}

Code for the rough ribbon:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) + ylim(min(df2$value)-0.1,max(df2$value)) +
         geom_ribbon(aes(ymax=value, ymin=min(value)-0.1), fill=color[i]))}

I searched for a way to work around that smoothing problem but found nothing. I'll create a question on the site, and if I find a solution I'll show it here!

EDIT 3:

After asking about it in a separate question, I found that using after_stat inside the aes argument of stat_smooth(geom="ribbon", aes(...)) solves it (for more information, read the link).

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) + 
           # after_stat() refers to variables computed by stat_smooth;
           # the fitted values are exposed as y
           stat_smooth(geom="ribbon", fill=color[i],
                       aes(ymax=after_stat(y), ymin=after_stat(min(y))-0.1)))}
