Edits in a ggplot2 plot, geom = "line"


Problem description




I have a line plot of some events at a hospital that I have been struggling with.

The challenges I haven't solved yet are: 1) sorting the lines on the plot so that the patient lines are ordered by Assessment date, 2) coloring the lines by the variable 'openCase', and finally, 3) removing the Discharge point (the blue square) for cases that fall in 2014 (or after some other arbitrary cutoff date).

Any help would be appreciated.

Here is my sample data:

library(ggplot2)
library(plyr)

df <- data.frame(
 date = seq(Sys.Date(), len= 156, by="5 day")[sample(156, 78)],
 openCase = rep(0:1, 39),
 patients = factor(rep(1:26, 3), labels = LETTERS)
)

df <- ddply(df, "patients", mutate, visit = order(date))
df$visit <- as.factor(df$visit)
levels(df$visit) <- c("Assessment (1)", "Treatment (2)", "Discharge (3)")

qplot(date, patients, data = df, geom = "line") + 
geom_point(aes(colour = visit), size = 2, shape=0)

I'm aware that my example data is not perfect: some of the assessment dates fall after the treatments, and some of the discharge dates fall before the assessments. That is part of the challenge, though, since my underlying data is messed up in the same way.

What it looks like at the moment,

Update 2012-04-30 16:30:13 PDT

My data is delivered from a database and looks something like this:

df <- structure(list(date = structure(c(15965L, 15680L, 16135L, 15730L, 
15920L, 15705L, 16110L, 15530L, 15575L, 15905L, 16140L, 15795L, 
15955L, 15945L, 16205L, 15675L, 15525L, 15830L, 15625L, 15725L, 
15855L, 15840L, 15615L, 15500L, 15780L, 15765L, 15610L, 15690L, 
16080L, 15570L, 15685L, 16175L, 15740L, 15600L, 15985L, 15485L, 
15605L, 16115L, 15535L, 15755L, 16145L, 16040L, 15970L, 16000L, 
16075L, 15995L, 16010L, 15990L, 15665L, 15895L, 15865L, 16120L, 
15880L, 15930L, 16055L, 15820L, 15650L, 16155L, 15700L, 15640L, 
15505L, 15750L, 15800L, 15775L, 15825L, 15635L, 16150L, 15860L, 
16100L, 15475L, 16050L, 15785L, 15495L, 15810L, 15805L, 15490L, 
15460L, 16085L), class = "Date"), openCase = c(0L, 0L, 0L, 1L, 
1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 
0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 
0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 
1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 
0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L), patients = structure(c(1L, 
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 
6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 
11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L, 
16L, 16L, 16L, 17L, 17L, 17L, 18L, 18L, 18L, 19L, 19L, 19L, 20L, 
20L, 20L, 21L, 21L, 21L, 22L, 22L, 22L, 23L, 23L, 23L, 24L, 24L, 
24L, 25L, 25L, 25L, 26L, 26L, 26L), .Label = c("A", "B", "C", 
"D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", 
"Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"), class = "factor"), 
    visit = structure(c(2L, 1L, 3L, 3L, 1L, 2L, 2L, 3L, 1L, 3L, 
    1L, 2L, 2L, 1L, 3L, 2L, 1L, 3L, 1L, 2L, 3L, 3L, 2L, 1L, 3L, 
    2L, 1L, 3L, 1L, 2L, 1L, 3L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 1L, 
    3L, 2L, 1L, 2L, 3L, 3L, 1L, 2L, 1L, 3L, 2L, 2L, 3L, 1L, 3L, 
    2L, 1L, 3L, 2L, 1L, 1L, 2L, 3L, 3L, 1L, 2L, 2L, 3L, 1L, 1L, 
    3L, 2L, 1L, 3L, 2L, 2L, 1L, 3L), .Label = c("zym", "xov", "poi"
    ), class = "factor")), .Names = c("date", "openCase", "patients", 
"visit"), row.names = c(NA, -78L), class = "data.frame")

The number of levels in visit, and their specific labels, will most likely change, so I would like some kind of code where I rank or sort based on my existing data (visit) instead of generating new variables.

Solution

I'm still not sure I understand what is wrong with @Ben's answer, but I'll try adding one of my own. Starting with the df given in the edit.

Create a new variable Visit (note the capital V) which is Assessment/Treatment/Discharge based on the ordering of the dates given. This is @Ben's code, just re-written.

df <- ddply(df, "patients", mutate, 
  Visit = factor(rank(date),
                 levels = 1:3,
                 labels=c("Assessment (1)", "Treatment (2)", "Discharge (3)")))

I don't understand how this relates to the original visit column in the data; in fact, the original visit column is not used hereafter:

> table(df$Visit, df$visit)

                 zym xov poi
  Assessment (1)  16   7   3
  Treatment (2)    3  16   7
  Discharge (3)    7   3  16
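
As a side note on why rank() is used for Visit here rather than the order() call from the question: order() returns the positions that would sort a vector, while rank() returns each element's position in the sorted result, and the two agree only for some inputs. A minimal base-R sketch with made-up dates (the toy data frame and the visit_no column are illustrative names, not part of the question's data):

```r
# Toy data: one patient, three visits entered out of chronological order
toy <- data.frame(
  patients = c("A", "A", "A"),
  date = as.Date(c("2013-03-01", "2013-01-01", "2013-02-01"))
)

# order(): row indices that would sort the dates
order(toy$date)   # 2 3 1

# rank(): chronological position of each row
rank(toy$date)    # 3 1 2

# Per-patient rank without plyr, using base R's ave()
toy$visit_no <- ave(as.numeric(toy$date), toy$patients, FUN = rank)
toy
```

Because nothing here hard-codes three levels, the same per-patient ranking should carry over if the number of visits per patient changes; only the labels passed to factor() would need adjusting.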

Reorder the patients (again copying Ben):

df$patients <- reorder(df$patients,df$date,function(x) min(as.numeric(x)))
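
To see what reorder() does in this step, here is a small sketch on made-up data (the toy data frame is hypothetical). It rearranges the factor levels by each patient's earliest date, which matches the Assessment date only under the assumption that the first visit is the assessment, as it is when Visit is derived from rank(date). ggplot2 draws a discrete axis in level order, so this is what sorts the lines.

```r
toy <- data.frame(
  patients = factor(c("A", "A", "B", "B", "C", "C")),
  date = as.Date(c("2013-05-01", "2013-06-01",
                   "2013-01-01", "2013-09-01",
                   "2013-03-01", "2013-04-01"))
)

levels(toy$patients)   # alphabetical: "A" "B" "C"

# Sort the levels by each patient's earliest date
toy$patients <- reorder(toy$patients, toy$date,
                        function(x) min(as.numeric(x)))

levels(toy$patients)   # earliest first: "B" "C" "A"
```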

Determine the subset of points that should be shown (same idea as Ben, but different code):

df2 <- df[!((df$Visit == "Discharge (3)") & (df$date > as.Date("2014-01-01"))),]
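
Since the question mentions that the cutoff date may change, the same filter could also be written with the cutoff held in a variable. A minimal sketch on made-up rows; cutoff, hide, toy and toy2 are names introduced here for illustration only:

```r
toy <- data.frame(
  Visit = c("Assessment (1)", "Discharge (3)", "Discharge (3)"),
  date = as.Date(c("2014-02-01", "2013-12-15", "2014-03-01"))
)

cutoff <- as.Date("2014-01-01")   # hypothetical cutoff; change as needed

# Rows to hide: discharge points dated strictly after the cutoff
hide <- toy$Visit == "Discharge (3)" & toy$date > cutoff

toy2 <- toy[!hide, ]
toy2   # keeps the 2014 assessment and the 2013 discharge
```

Note that with a strict >, a discharge dated exactly on the cutoff is still shown; use >= if that point should be dropped as well.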

To add something new, here is a way to make the lines different colors without impacting the legend:

ggplot(df, aes(date, patients)) +
    geom_blank() +
    geom_line(data = df[df$openCase == 0,], colour = "black") +
    geom_line(data = df[df$openCase == 1,], colour = "red") +
    geom_point(data = df2, aes(colour = Visit), size = 2, shape = 0)
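
The lines stay out of the legend because their colour is set as a fixed value outside aes(), so only the Visit mapping on the points feeds the colour scale. A small self-contained sketch on made-up data that builds the same layer structure (all values in toy are illustrative, and the object name p is an assumption of this example):

```r
library(ggplot2)

visit_labels <- c("Assessment (1)", "Treatment (2)", "Discharge (3)")

toy <- data.frame(
  patients = factor(rep(c("A", "B"), each = 3)),
  date = as.Date("2013-01-01") + c(0, 30, 60, 10, 40, 70),
  openCase = rep(c(0, 1), each = 3),
  Visit = factor(rep(visit_labels, 2), levels = visit_labels)
)

p <- ggplot(toy, aes(date, patients)) +
  geom_blank() +   # draws nothing, but trains the scales on the full data
  # fixed colours outside aes(): no legend entries for the lines
  geom_line(data = toy[toy$openCase == 0, ], colour = "black") +
  geom_line(data = toy[toy$openCase == 1, ], colour = "red") +
  # mapped colour inside aes(): this is what the legend shows
  geom_point(aes(colour = Visit), size = 2, shape = 0)

p
```

Mapping openCase inside aes(colour = ...) instead would put it on the same colour scale as Visit, which is why the fixed-colour approach is used here.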
