ggplot line graph with NA values
Question
I'm having trouble with ggplot when trying to plot 2 incomplete time series on the same graph. The two y series do not have values for the same points on the x-axis (year), so NAs are present for certain years:
test<-structure(list(YEAR = c(1937, 1938, 1942, 1943, 1947, 1948, 1952,
1953, 1957, 1958, 1962, 1963, 1967, 1968, 1972, 1973, 1977, 1978,
1982, 1983, 1986.5, 1987, 1993.5), A1 = c(NA, 24, NA, 32, 32,
NA, 34, NA, NA, 18, 12, NA, 10, NA, 11, NA, 15, NA, 24, NA, NA,
25, 26), A2 = c(40, NA, 38, NA, 25, NA, 26, NA, 20, NA, 17,
17, 17, NA, 16, 18, 21, 18, 17, 25, NA, NA, 26)), .Names = c("YEAR", "A1",
"A2"), row.names = c(NA, -23L), class = "data.frame")
The following code, which I tried, produces a disjointed mess:
ggplot(test, aes(x=YEAR)) +
geom_line(aes(y = A1), size=0.43, colour="red") +
geom_line(aes(y = A2), size=0.43, colour="green") +
xlab("Year") + ylab("Percent") +
scale_x_continuous(limits=c(1935, 1995), breaks = seq(1935, 1995, 5),
expand = c(0, 0)) +
scale_y_continuous(limits=c(0,50), breaks=seq(0, 50, 10), expand = c(0, 0))
How can I solve this problem?
Recommended answer
My preferred solution would be to reshape the data to long format and drop the rows with NA values. geom_line breaks a line wherever it meets an NA, so once those rows are gone each series connects its observed points. You then need only 1 geom_line call, which is tidier, especially if you have many series. The result is the same as LyzandeR's 2nd chart.
library(ggplot2)
library(reshape2)
# Reshape from wide (A1, A2 columns) to long (variable, value columns)
test2 <- melt(test, id.vars='YEAR')
# Drop rows with missing values so each series is drawn as one connected line
test2 <- na.omit(test2)
ggplot(test2, aes(x=YEAR, y=value, color=variable)) +
  geom_line() +
  scale_color_manual(values=c('red', 'green')) +
  xlab("Year") + ylab("Percent") +
  scale_x_continuous(limits=c(1935, 1995), breaks = seq(1935, 1995, 5),
                     expand = c(0, 0)) +
  scale_y_continuous(limits=c(0,50), breaks=seq(0, 50, 10), expand = c(0, 0))
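For readers who prefer tidyr to reshape2, the same reshape-and-drop step can be sketched as below. This is an alternative under stated assumptions, not part of the original answer: it assumes the tidyr package is installed, and the name test_long is made up for illustration. The data frame is the one from the question, rebuilt with data.frame().

```r
library(ggplot2)
library(tidyr)  # assumption: tidyr is used here in place of reshape2

# Same data as in the question
test <- data.frame(
  YEAR = c(1937, 1938, 1942, 1943, 1947, 1948, 1952, 1953, 1957, 1958,
           1962, 1963, 1967, 1968, 1972, 1973, 1977, 1978, 1982, 1983,
           1986.5, 1987, 1993.5),
  A1 = c(NA, 24, NA, 32, 32, NA, 34, NA, NA, 18, 12, NA, 10, NA, 11, NA,
         15, NA, 24, NA, NA, 25, 26),
  A2 = c(40, NA, 38, NA, 25, NA, 26, NA, 20, NA, 17, 17, 17, NA, 16, 18,
         21, 18, 17, 25, NA, NA, 26)
)

# Reshape to long format and drop NA rows in a single step
# (test_long is a hypothetical name, not from the original answer)
test_long <- pivot_longer(test, cols = c(A1, A2),
                          names_to = "variable", values_to = "value",
                          values_drop_na = TRUE)

# One geom_line call draws both series; colours are matched by series name
p <- ggplot(test_long, aes(x = YEAR, y = value, colour = variable)) +
  geom_line() +
  scale_colour_manual(values = c(A1 = "red", A2 = "green")) +
  xlab("Year") + ylab("Percent")
print(p)
```

Naming the colours by series (A1 = "red", A2 = "green") keeps the mapping stable even if the order of the series changes.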
You might also consider adding a geom_point() call in addition to the line, so it is clear which points are real values and where values are missing. Another advantage of the long format is that each additional geom takes just 1 call, rather than 1 call per series.
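A minimal sketch of that suggestion follows. To keep it short and self-contained, it uses only the first six years of the question's data, typed in directly in long format with the NA rows already dropped; the point size of 2 is an arbitrary choice for illustration.

```r
library(ggplot2)

# First six years of the question's data, in long format with NA rows removed
test2 <- data.frame(
  YEAR     = c(1938, 1943, 1947, 1937, 1942, 1947),
  variable = rep(c("A1", "A2"), each = 3),
  value    = c(24, 32, 32, 40, 38, 25)
)

p <- ggplot(test2, aes(x = YEAR, y = value, colour = variable)) +
  geom_line() +
  geom_point(size = 2) +  # a single call marks the observed values of every series
  scale_colour_manual(values = c(A1 = "red", A2 = "green")) +
  xlab("Year") + ylab("Percent")
print(p)
```

Line segments that span a year with no marker are then visibly interpolated across a gap rather than backed by a measurement.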