Line graph with 2 categorical variables and 1 continuous variable in R
Question
I'm quite new to R and to statistics in general. I am trying to draw a line graph in ggplot2 from two categorical variables (part of speech "pos", condition "trcond") and one numerical variable (score "totacc").
> df1 <- df[, c("trcond", "subtitle", "pos", "totacc")]
> head(df1)
   trcond     subtitle pos totacc
7       L New Scene_16 lex  0.250
29      N New Scene_16 lex  0.500
8       L New Scene_25 lex  0.875
30      N New Scene_25 lex  0.666
9       L New Scene_29 lex  1.000
31      N New Scene_29 lex  0.833
I have used this ggplot2 command:
> ggplot(data=summdfo, aes(x=pos, y=totacc, group=trcond, colour=trcond)) +
    geom_line() + geom_point()
But it is not working: the graph has coloured (blue and red) dots all over the place and more than just two lines linking them. I would like to post the graph I get, as I lack the words to describe it, but this is my first post and I don't seem to be able to upload pictures.
I would like to get a standard, simple two-line graph such as the blue and red ones on this page (where y = total bill, x = time (lunch, dinner), grouped by gender): http://www.cookbook-r.com/Graphs/Bar_and_line_graphs_%28ggplot2%29/
Is this possible with my data set at all? If so, what am I doing wrong in the code?
Recommended answer
Here I have tried to create a data frame based on the limited sample of your data.
df1 <- data.frame(trcond=rep(c('L', 'N'), 3),
                  subtitle=rep('New Scene_29', 6), # Not in use, just a dummy
                  pos=c('lex', 'lex', 'lex', 'noLex', 'noLex', 'noLex'),
                  totacc=c(0.250, 0.5, 0.875, 0.666, 1.000, 0.833))
Because trcond by pos is not balanced in this data frame (some combinations of the two have more than one observation), geom_line() connects every raw point within a group and the plot comes out jumbled, like this:
ggplot(data=df1, aes(x=pos, y=totacc, group=trcond, color=trcond))+
geom_line() +
geom_point()
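To see the imbalance directly, one quick check is to count the observations in each trcond by pos cell. This is only a sketch on the toy data frame above (rebuilt here without the unused subtitle column); the name counts is introduced just for illustration.

```r
# Toy data copied from the answer (the unused 'subtitle' column is omitted)
df1 <- data.frame(trcond = rep(c("L", "N"), 3),
                  pos    = c("lex", "lex", "lex", "noLex", "noLex", "noLex"),
                  totacc = c(0.250, 0.5, 0.875, 0.666, 1.000, 0.833))

# Cross-tabulate: how many rows fall into each trcond x pos combination?
counts <- table(df1$trcond, df1$pos)
print(counts)
```

Any cell with a count above 1 contributes several points at the same x position, which is why the lines zigzag instead of forming one segment per condition.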
However, if you apply a summary function that computes the mean for each combination of pos and trcond, the correct plot appears:
# In ggplot2 >= 3.3.0 the argument is 'fun'; older versions use 'fun.y'
ggplot(data=df1, aes(x=pos, y=totacc, group=trcond, color=trcond))+
geom_line(stat='summary', fun='mean') +
geom_point(stat='summary', fun='mean')
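An alternative that should give the same picture is to compute the means first and then plot the summarised data frame. The sketch below assumes the toy df1 from this answer and uses base R's aggregate(); the name means is introduced only for illustration, and the plotting step is skipped if ggplot2 is not installed.

```r
# Toy data copied from the answer (the unused 'subtitle' column is omitted)
df1 <- data.frame(trcond = rep(c("L", "N"), 3),
                  pos    = c("lex", "lex", "lex", "noLex", "noLex", "noLex"),
                  totacc = c(0.250, 0.5, 0.875, 0.666, 1.000, 0.833))

# One row per trcond x pos combination, holding the mean score
means <- aggregate(totacc ~ trcond + pos, data = df1, FUN = mean)
print(means)

# With exactly one value per cell, plain geom_line() draws one line per condition
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(means, aes(x = pos, y = totacc, group = trcond, colour = trcond)) +
    geom_line() +
    geom_point()
  print(p)
}
```

Pre-aggregating also makes it easy to inspect the numbers being plotted, which can help when the figure does not look as expected.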
Again, this is only a guess at what is in your data. It would be best to post a sample of your data here using dput(head(df1, 50)) so that you can get a better answer.
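For illustration, this is roughly how dput() makes data shareable: it prints R code that rebuilds the object, so a reader can paste it straight into a session. The sketch uses the toy data frame from this answer; txt and df2 are names introduced here only to demonstrate the round trip.

```r
# Toy data copied from the answer (the unused 'subtitle' column is omitted)
df1 <- data.frame(trcond = rep(c("L", "N"), 3),
                  pos    = c("lex", "lex", "lex", "noLex", "noLex", "noLex"),
                  totacc = c(0.250, 0.5, 0.875, 0.666, 1.000, 0.833))

# Print a text representation of (up to) the first 50 rows; paste this into a post
dput(head(df1, 50))

# Round trip: evaluating the printed text recreates the data frame
txt <- paste(capture.output(dput(df1)), collapse = "\n")
df2 <- eval(parse(text = txt))
print(all.equal(df1, df2))
```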