Most succinct way to label/annotate extreme values with ggplot?

Problem description

I'd like to annotate all y-values greater than a y-threshold using ggplot2.

When you call plot(lm(y ~ x)) in base R, the first graph that pops up is Residuals vs Fitted, the second is the Normal Q-Q plot, and the third is Scale-Location. Each of these automatically labels your extreme Y values with an adjacent annotation identifying the observation. I'm looking for something like this.

What's the best way to achieve this base-default behavior using ggplot2?

Solution

Updated: the code now uses scale_size_area() in place of the deprecated scale_area().

You might be able to take something from this to suit your needs.

library(ggplot2)

#Some data
df <- data.frame(x = round(runif(100), 2), y = round(runif(100), 2))

m1 <- lm(y ~ x, data = df)
df.fortified = fortify(m1)

names(df.fortified)   # Names of the variables containing residuals and derived quantities

# Select extreme values
df.fortified$extreme = ifelse(abs(df.fortified$`.stdresid`) > 1.5, 1, 0)

# Based on examples on page 173 in Wickham's ggplot2 book
plot = ggplot(data = df.fortified, aes(x = x, y = .stdresid)) +
 geom_point() +
 geom_text(data = df.fortified[df.fortified$extreme == 1, ], 
   aes(label = x, x = x, y = .stdresid), size = 3, hjust = -.3)
plot

plot1 = ggplot(data = df.fortified, aes(x = .fitted, y = .resid)) +
   geom_point() + geom_smooth(se = FALSE)

plot2 = ggplot(data = df.fortified, aes(x = .fitted, y = .resid, size = .cooksd)) +
   # show.legend replaces the deprecated show_guide argument (ggplot2 >= 2.0.0)
   geom_point() + scale_size_area("Cook's distance") + geom_smooth(se = FALSE, show.legend = FALSE)

library(gridExtra)
grid.arrange(plot1, plot2)
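
The question asks specifically about labelling y-values above a threshold, so the same subset-then-geom_text idea can be applied to the raw data rather than to residuals. The following is only a minimal sketch under some assumptions: the cutoff y_threshold, the seed, and the names extreme and p are made up for illustration and are not part of the original answer.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = round(runif(100), 2), y = round(runif(100), 2))

y_threshold <- 0.9  # hypothetical cutoff; choose whatever suits the data

# Keep only the rows whose y exceeds the cutoff; only these get a label
extreme <- df[df$y > y_threshold, ]

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # Dashed reference line at the cutoff
  geom_hline(yintercept = y_threshold, linetype = "dashed") +
  # Label each extreme point with its x value, nudged to the right
  geom_text(data = extreme, aes(label = x), size = 3, hjust = -0.3)
p
```

Passing a subset to the data argument of geom_text() keeps the labelling rule in one place, so changing the cutoff (or swapping in a rule such as abs(.stdresid) > 1.5) only touches the line that builds the subset.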
