How can I transform aesthetics 'on the fly' in ggplot using variables inside or outside the relevant dataframe?

Problem description

In psychology, it's common to display histograms with an overlaid normal curve. Showing the density of the observed values with geom_line as well would make comparison to the normal curve easier, so I wrote another histogram function that does this (powerHist in the userfriendlyscience package). However, it is very slow for large vectors (I'm currently working with 16.7 million datapoints), so I'm trying to make it faster. I used to call density() to compute the density estimates manually and then multiply them by the maximum number of datapoints in a bin, so that the curve is scaled to match the histogram.

But this is very slow, and I figured ggplot2 should be able to do this itself. One of the variables computed by stat_density is ..scaled.., which is the density estimate scaled to a maximum of 1, so all I need to do is multiply it. But ggplot2 won't find the variable I multiply by. Multiplying by a constant works fine, but a variable is not found, whether or not I place it in the dataframe I pass to ggplot2.

scalingFactor <- max(table(cut(mtcars$mpg, breaks=20)));
dat <- data.frame(mpg = mtcars$mpg,
                  scalingFactor = scalingFactor);
ggplot(mtcars, aes(x=mpg)) +
  geom_histogram(bins=20) +
  geom_line(aes(y=..scaled.. * scalingFactor),
            stat='density', color='red');

This yields:

Error in eval(expr, envir, enclos) : object 'scalingFactor' not found

When replacing the scalingFactor with a regular number, it works:

ggplot(mtcars, aes(x=mpg)) +
  geom_histogram(bins=20) +
  geom_line(aes(y=..scaled.. * 10),
            stat='density', color='red');

Also, when just using scalingFactor on its own, it also works:

ggplot(mtcars, aes(x=mpg)) +
  geom_histogram(bins=20) +
  geom_line(aes(y=scalingFactor ),
            stat='density', color='red');

So scalingFactor seems available; multiplication is available; and clearly ..scaled.. is available. Still, combining them seems to fail. What am I missing here? I can't find anything on 'computation with variables generated by stat' or something . . .

Has anybody run into this before? Is it known ggplot2 behavior that I just missed?

Solution

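Try aes_q(y = bquote(..scaled.. * .(scalingFactor))) in place of aes() in the geom_line() call. bquote() substitutes the value of scalingFactor into the expression, so ggplot2 no longer has to look the variable up.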

(although I would think there is a bug somewhere, since the environment argument in ?ggplot suggests this shouldn't be needed, and in fact isn't needed when dealing with variables that don't come from a stat)
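
For reference, here is a minimal sketch of the suggestion applied to the question's mtcars example. bquote() splices the current value of scalingFactor into the expression through .(), so the aesthetic carries a plain number rather than a variable name that has to be resolved later. The second variant is not part of the original answer: it assumes ggplot2 3.3.0 or later, where after_stat() replaces the ..scaled.. notation and aes() captures the calling environment, so the outside variable should be found directly. The names scaled_expr, p and p_modern are only illustrative.

```r
library(ggplot2)

# Tallest bar when mpg is cut into 20 bins (taken from the question).
scalingFactor <- max(table(cut(mtcars$mpg, breaks = 20)))

# bquote() replaces .(scalingFactor) with its value, leaving ..scaled..
# as an unevaluated symbol for stat_density to fill in.
scaled_expr <- bquote(..scaled.. * .(scalingFactor))

# Variant 1: the approach from the answer, using the quoted expression.
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20) +
  geom_line(aes_q(y = scaled_expr),
            stat = "density", color = "red")

# Variant 2 (assumption: ggplot2 >= 3.3.0, not from the original answer).
# after_stat() marks `scaled` as a variable computed by the stat, and
# scalingFactor is looked up in the environment where aes() was called.
p_modern <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 20) +
  geom_line(aes(y = after_stat(scaled) * scalingFactor),
            stat = "density", color = "red")
```

Calling print(p) or print(p_modern) draws the plot. On recent ggplot2 versions the first variant may emit deprecation warnings for aes_q() and for the ..scaled.. notation, which is one reason to prefer the second form where it is available.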
