How to graph "before and after" measures using ggplot with connecting lines and subsets?
Problem description
I’m totally new to ggplot, relatively fresh with R and want to make a smashing "before-and-after" scatterplot with connecting lines to illustrate the movement in percentages of different subgroups before and after a special training initiative. I’ve tried some options, but have yet to:
- show each individual observation separately (now same values are overlapping)
- connect the related before and after measures (x=0 and x=1) with lines to more clearly illustrate the direction of variation
- subset the data along class and id using shape and colors
How can I best create a scatter plot using ggplot (or other) fulfilling the above demands?
Main alternative: geom_point()
Here is some sample data and example code using geom_point():
library(ggplot2)
x <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1) # 0=before, 1=after
y <- c(45,30,10,40,10,NA,30,80,80,NA,95,NA,90,NA,90,70,10,80,98,95) # percentage of "feelings of peace"
class <- c(0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,1) # 0=multiple days 1=one day
id <- c(1,1,2,3,4,4,4,4,5,6,1,1,2,3,4,4,4,4,5,6) # id = per individual
df <- data.frame(x,y,class,id)
ggplot(df, aes(x=x, y=y), fill=id, shape=class) + geom_point()
Alternative: scale_size()
I have explored stat_sum() to summarize the frequencies of overlapping observations, but then I can no longer subset using colors and shapes, because the overlapping points are merged.
ggplot(df, aes(x=x, y=y)) +
  stat_sum()
Alternative: geom_dotplot()
I have also explored geom_dotplot() to separate the overlapping observations that arise from using geom_point(), as in the example below. However, I have yet to understand how to combine the before and after measures into the same plot.
library(gridExtra) # provides grid.arrange()
df1 <- df[1:10,] # data before
df2 <- df[11:20,] # data after
p1 <- ggplot(df1, aes(x=x, y=y)) +
  geom_dotplot(binaxis = "y", stackdir = "center", stackratio=2,
               binwidth=(1/0.3))
p2 <- ggplot(df2, aes(x=x, y=y)) +
  geom_dotplot(binaxis = "y", stackdir = "center", stackratio=2,
               binwidth=(1/0.3))
grid.arrange(p1, p2, nrow=1) # two separate plots side by side
Or maybe it is better to summarize the data by x, id and class as the mean/median of y, filter out the ids that produce NAs (e.g. ids 3 and 6), and connect the points by lines? So if you don't really need to show the variability within each id (which could be true if the plot only illustrates tendencies), you can do it this way:
library(ggplot2) # the package is called ggplot2, not ggplot
library(dplyr)
#library(ggthemes)

# collapse repeated measurements to one median per id and time point
df <- df %>%
  group_by(x, id, class) %>%
  summarize(y = median(y, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(
    id = factor(id),
    x = factor(x, labels = c("before", "after")),
    # label order follows the coding in the data: 0 = multiple days, 1 = one day
    class = factor(class, labels = c("multiple days", "one day"))
  ) %>%
  # drop every id that has no usable value before or after
  group_by(id) %>%
  mutate(nas = any(is.na(y))) %>%
  ungroup() %>%
  filter(!nas) %>%
  select(-nas)

ggplot(df, aes(x = x, y = y, col = id, group = id)) +
  geom_point(aes(shape = class)) +
  geom_line(show.legend = FALSE) +
  #theme_few() +
  #theme(legend.position = "none") +
  ylab("Feelings of peace, %") +
  xlab("")
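If the individual observations should stay visible as well, one possible variation is to draw the raw points with a small horizontal jitter and lay the per-id median lines on top. This is only a sketch under my own assumptions: the names raw and med, the jitter width of 0.08 and the seed are illustrative choices, and the class labels follow the coding given in the question's comments (0 = multiple days, 1 = one day). It rebuilds the sample data so that it runs on its own.

```r
library(ggplot2)
library(dplyr)

# Sample data from the question, rebuilt so the sketch is self-contained
raw <- data.frame(
  x = rep(c(0, 1), each = 10),  # 0 = before, 1 = after
  y = c(45, 30, 10, 40, 10, NA, 30, 80, 80, NA,
        95, NA, 90, NA, 90, 70, 10, 80, 98, 95),
  class = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1),
  id = c(1, 1, 2, 3, 4, 4, 4, 4, 5, 6, 1, 1, 2, 3, 4, 4, 4, 4, 5, 6)
) %>%
  mutate(
    id = factor(id),
    x = factor(x, labels = c("before", "after")),
    class = factor(class, labels = c("multiple days", "one day"))
  )

# One median per id and time point; ids lacking a before or after value are dropped
med <- raw %>%
  group_by(x, id, class) %>%
  summarize(y = median(y, na.rm = TRUE), .groups = "drop") %>%
  group_by(id) %>%
  filter(!any(is.na(y))) %>%
  ungroup()

# Raw observations as jittered points, medians as connecting lines
p <- ggplot(raw, aes(x = x, y = y, col = id)) +
  geom_point(
    aes(shape = class),
    na.rm = TRUE,  # silently skip the missing y values
    position = position_jitter(width = 0.08, height = 0, seed = 1)
  ) +
  geom_line(data = med, aes(group = id), show.legend = FALSE) +
  ylab("Feelings of peace, %") +
  xlab("")

print(p)
```

The jitter is horizontal only (height = 0), so the plotted percentages stay exact. The lines connect medians rather than individual measurements, because the sample data do not say which "before" value belongs to which "after" value within an id. Ids 3 and 6 still appear as points where they have data, but they get no line.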