2-y Axes Plot with calculated absolute & relative Change in R
Problem description
I am trying to calculate two things in R, Relative & Absolute Change, & plot 2-y axis scatter plots. I am still seeking inputs on creating a 2-y axis plot for this type of data.
library(data.table)
set.seed(123)
df <- expand.grid(PatientID = 1:3, time = 1:3, Group = 1:2)
# one random outcome per row of df
dat <- data.table(df, Outcome = as.integer(runif(nrow(df)) * 100))
Data Format df #sample
PatientID time Outcome Group
1 1 87 1
1 2 32 1
1 3 76 2
2 1 21 2
2 2 23 3
2 3 23 3
## Cont until 200 PatientID or volunteers and there are many outcome measure columns (33:290)
PatientID, time, Outcome, Group denote volunteers' identification number, time of visiting a hospital, outcome measure of interest and Group (whether they belong to a condition A or B) respectively. Data includes 3 visits by participants and two groups.
- Relative Change (%), i.e. the absolute change expressed as a percentage of the outcome at the baseline time point, for Groups 1 & 2:
[(F - B) / B] * 100, where B and F are the baseline and follow-up values of an outcome measure
- Absolute Change, i.e. F - B
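As a quick check of the two formulas, here is a minimal sketch in base R. The helper names `abs_change` and `rel_change` are made up for illustration and do not come from any package; 87 and 32 are the first two outcomes of patient 1 in the sample above.

```r
# Hypothetical helpers, not part of any package
abs_change <- function(follow_up, baseline) follow_up - baseline
rel_change <- function(follow_up, baseline) (follow_up - baseline) / baseline * 100

baseline  <- 87   # B: outcome at time = 1
follow_up <- 32   # F: outcome at time = 2

abs_change(follow_up, baseline)             # -55
round(rel_change(follow_up, baseline), 2)   # -63.22
```

Both helpers are vectorised, so they can be applied to whole columns as long as the baseline and follow-up vectors are in the same patient order.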
2-y axes scatter plots:
The prime purpose of this plot is to look at the changes in outcome measures with respect to baseline (time=1), and also determine if there are group differences. It is prudent to include respective relative/absolute change values in the plot as y1 and y2.
I made several scatterplots in ggplot2 and ggvis to view the trends, but I did not find a direct way to calculate (and plot) relative and absolute change with the ggplot2 and ggvis packages. I really recommend both packages to novice users like myself. In addition, I plan to show the relative and absolute change values for one outcome measure in a single scatterplot, i.e. a 2-y axes plot.
Let me know if you require some more clarifications. Thanks, and looking forward!
Answers to questions 1 & 2 (thought it might help others)
This is how I finally did it:
library(dplyr)
# one data frame per visit
dft1 <- filter(df, time == 1)
dft2 <- filter(df, time == 2)
dft3 <- filter(df, time == 3)
To calculate absolute change from second to first time point & third to first time point:
abs1=dft2[33:290] - dft1[33:290]
abs2=dft3[33:290] - dft1[33:290]
To calculate relative change from second to first time point & third to first time point:
rel1=abs1/dft1[33:290]*100
rel2=abs2/dft1[33:290]*100
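The subtraction above is positional, so it only yields per-patient changes if dft1, dft2 and dft3 list the same patients in the same order. Here is a minimal sketch of one way to guard against that in base R. It assumes a PatientID column as in the sample data, and uses a small made-up data frame with a single Outcome column in place of columns 33:290.

```r
# Toy long-format data, deliberately not sorted by patient
df <- data.frame(
  PatientID = c(2, 1, 1, 2, 2, 1),
  time      = c(1, 1, 2, 2, 3, 3),
  Outcome   = c(21, 87, 24, 32, 27, 76)
)

# Split by visit and sort each piece by PatientID so that rows line up
by_visit <- lapply(split(df, df$time), function(d) d[order(d$PatientID), ])
base <- by_visit[["1"]]

# Fail early if a follow-up visit does not contain the same patients as baseline
stopifnot(identical(base$PatientID, by_visit[["2"]]$PatientID),
          identical(base$PatientID, by_visit[["3"]]$PatientID))

abs1 <- by_visit[["2"]]$Outcome - base$Outcome   # time 2 vs time 1
abs2 <- by_visit[["3"]]$Outcome - base$Outcome   # time 3 vs time 1
rel1 <- abs1 / base$Outcome * 100
rel2 <- abs2 / base$Outcome * 100
```

With many outcome columns the same check applies unchanged; only the subtraction would be done on the column range instead of a single column.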
I will put absolute change and relative change on different y-axes. This link was handy to get me started: (How can I plot with 2 different y-axes?).
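A minimal sketch of the two-axis idea in base R graphics: draw absolute change against the left axis, then overlay relative change with par(new = TRUE) and label it on the right with axis(4). The numbers are made-up changes for one outcome measure, not results from the real data, and the colours and labels are arbitrary choices.

```r
# Made-up changes from baseline for three patients at visits 2 and 3
chg <- data.frame(
  PatientID = rep(1:3, times = 2),
  time      = rep(2:3, each = 3),
  abs       = c(-63, 11, 5, -11, 6, 9),
  rel       = c(-72.4, 52.4, 10.0, -12.6, 28.6, 18.0)
)

op <- par(mar = c(5, 4, 4, 5))   # widen the right margin for the second axis

# First layer: absolute change on the left y-axis
plot(chg$time, chg$abs, pch = 16, col = "steelblue", xaxt = "n",
     xlab = "time", ylab = "Absolute change (F - B)")
axis(1, at = 2:3)

# Second layer: relative change, drawn over the same plot region
par(new = TRUE)
plot(chg$time, chg$rel, pch = 17, col = "firebrick",
     axes = FALSE, xlab = "", ylab = "")
axis(4)                                           # right-hand y-axis
mtext("Relative change (%)", side = 4, line = 3)
legend("topright", legend = c("absolute", "relative"),
       pch = c(16, 17), col = c("steelblue", "firebrick"))

par(op)   # restore the previous margin setting
```

Because the two layers have independent y-scales, a point from one layer cannot be compared by height with a point from the other; the colours, symbols and legend carry that distinction.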
Nice resource for learning R: https://stackoverflow.com/tags/r/info
Not clear exactly what you mean but you should be able to modify this code to achieve your purpose:
library(data.table)
dat <- data.table(PatientID = rep(1:2, each = 3), time = rep(1:3, times = 2), Outcome = c(87, 24, 76, 21, 32, 27))
#Modified so you can actually compare across 2 time periods
#Note your data is already sorted, but to be on the safe side:
setkey(dat,PatientID,time)
dat[, `:=`(rel.change.1 = 100 * (Outcome - shift(Outcome)) / Outcome,
rel.change.2 = 100 * (Outcome - shift(Outcome, 2)) / Outcome,
abs.change.1 = Outcome - shift(Outcome),
abs.change.2 = Outcome - shift(Outcome, 2)),
by = PatientID]
The key idea is to use shift to get a shift of the Outcome column; the second argument to shift is the number of rows by which to shift it. Combined with grouping by PatientID, and given that we keyed the data.table in order to ensure it was sorted by time within groups of PatientID, this ensures the correct comparison. (Note: if your actual data is not complete, this will not produce correct results. For example, if you have observations at times 1 and 4 for PatientID = 1 but at times 2 and 3 for PatientID = 2, then both 1-shifts will compare these observations even though they are not the same number of time units apart. If this is the case, you should use CJ on the ID and time columns to get rows filled with NAs for all the missing observations; that will ensure that the shifts reflect the correct time differences.)
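A minimal sketch of that CJ step, using the incomplete case described in the note (patient 1 seen at times 1 and 4, patient 2 at times 2 and 3). The table name `visits` and the Outcome values are made up for illustration.

```r
library(data.table)

# Hypothetical incomplete data
visits <- data.table(PatientID = c(1, 1, 2, 2),
                     time      = c(1, 4, 2, 3),
                     Outcome   = c(87, 76, 21, 27))

# Every PatientID x time combination; CJ() returns it sorted by both columns
grid <- CJ(PatientID = unique(visits$PatientID), time = unique(visits$time))

# Join so that missing visits become rows with Outcome = NA
full <- visits[grid, on = .(PatientID, time)]

# A 1-row shift now always means "the previous time point"
full[, abs.change.1 := Outcome - shift(Outcome), by = PatientID]
full
```

Here only patient 2 at time 3 gets a non-NA change (27 - 21 = 6), because it is the only row whose immediately preceding time point was actually observed.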
This produces:
> dat
PatientID time Outcome rel.change.1 rel.change.2 abs.change.1 abs.change.2
1: 1 1 87 NA NA NA NA
2: 1 2 24 -262.50000 NA -63 NA
3: 1 3 76 68.42105 -14.47368 52 -11
4: 2 1 21 NA NA NA NA
5: 2 2 32 34.37500 NA 11 NA
6: 2 3 27 -18.51852 22.22222 -5 6
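Note that the code above divides by the current Outcome (the follow-up value), and that rel.change.1 at time 3 compares against time 2 rather than time 1. If the definition from the question, [(F - B) / B] * 100 with B fixed at time 1, is what is wanted, a minimal sketch of that variant follows. The names `dat_base`, `baseline`, `abs.from.base` and `rel.from.base` are made up for illustration, and the sketch assumes every patient has exactly one row with time == 1.

```r
library(data.table)

dat_base <- data.table(PatientID = rep(1:2, each = 3),
                       time      = rep(1:3, times = 2),
                       Outcome   = c(87, 24, 76, 21, 32, 27))

# Within each patient, take the time-1 value as the baseline B
dat_base[, baseline := Outcome[time == 1], by = PatientID]

# Absolute change F - B and relative change (F - B) / B * 100
dat_base[, `:=`(abs.from.base = Outcome - baseline,
                rel.from.base = 100 * (Outcome - baseline) / baseline)]
dat_base
```

For patient 1 this gives absolute changes of -63 and -11 at times 2 and 3, and the baseline rows themselves come out as 0 rather than NA.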
Now we can melt:
melted <- melt(dat, id.vars = c("PatientID", "time"), variable.factor = FALSE)
> melted
PatientID time variable value
1: 1 1 Outcome 87.00000
2: 1 2 Outcome 24.00000
3: 1 3 Outcome 76.00000
4: 2 1 Outcome 21.00000
5: 2 2 Outcome 32.00000
6: 2 3 Outcome 27.00000
7: 1 1 rel.change.1 NA
8: 1 2 rel.change.1 -262.50000
9: 1 3 rel.change.1 68.42105
10: 2 1 rel.change.1 NA
11: 2 2 rel.change.1 34.37500
12: 2 3 rel.change.1 -18.51852
13: 1 1 rel.change.2 NA
14: 1 2 rel.change.2 NA
15: 1 3 rel.change.2 -14.47368
16: 2 1 rel.change.2 NA
17: 2 2 rel.change.2 NA
18: 2 3 rel.change.2 22.22222
19: 1 1 abs.change.1 NA
20: 1 2 abs.change.1 -63.00000
21: 1 3 abs.change.1 52.00000
22: 2 1 abs.change.1 NA
23: 2 2 abs.change.1 11.00000
24: 2 3 abs.change.1 -5.00000
25: 1 1 abs.change.2 NA
26: 1 2 abs.change.2 NA
27: 1 3 abs.change.2 -11.00000
28: 2 1 abs.change.2 NA
29: 2 2 abs.change.2 NA
30: 2 3 abs.change.2 6.00000
PatientID time variable value
And plot:
library(ggplot2)
ggplot(melted, aes(x = time, y = value, color = factor(PatientID))) +
  geom_point() +
  facet_wrap(~variable, scales = "free") +
  labs(color = "PatientID")