How to subset a data frame by a factor and repeat a plot for each subset?


Question

I am new to R. Forgive me if this question has an obvious answer, but I've not been able to find a solution. I have experience with SAS and may just be thinking of this problem in the wrong way.

I have a dataset with repeated measures from hundreds of subjects with each subject having multiple measurements across different ages. Each subject is identified by an ID variable. I'd like to plot each measurement (let's say body WEIGHT) by AGE for each individual subject (ID).

I've used ggplot2 to do something like this:

ggplot(data = dataset, aes(x = AGE, y = WEIGHT )) + geom_line() + facet_wrap(~ID)

This works well for a small number of subjects but won't work for the entire dataset.

I've also tried something like this:

ggplot(data=data, aes(x = AGE,y = BW, group = ID, colour = ID)) + geom_line()

This also works for a small number of subjects but is unreadable with hundreds of subjects.

I've tried to subset using code like this:

temp <- split(dataset,dataset$ID)

but I'm not sure how to work with the resulting dataset. Or perhaps there is a way to simply adjust the facet_wrap so that individual plots are created?

Thanks!

Solution

Because you want to split up the dataset and make a plot for each level of a factor, I would approach this with one of the split-apply-return tools from the plyr package.

Here is a toy example using the mtcars dataset. I first create the plot and name it p, then use dlply to split the dataset by a factor and return a plot for each level. I'm taking advantage of %+% from ggplot2 to replace the data.frame in a plot.

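library(ggplot2)
library(plyr)

# Base plot; its data is swapped out for each subset below
p = ggplot(data = mtcars, aes(x = wt, y = mpg)) + 
    geom_line()

# Split mtcars by cyl and return one plot per level
dlply(mtcars, .(cyl), function(x) p %+% x)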

This returns all the plots, one after another. If you name the resulting list object you can also call one plot at a time.

plots = dlply(mtcars, .(cyl), function(x) p %+% x)
plots[[1]]  # [[ ]] extracts the plot itself; plots[1] returns a one-element list
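
If you would rather stay in base R, the `split()` call from the question can be used directly. This is only a minimal sketch on the same mtcars toy data: `pieces`, `make_plot`, and `plots_base` are names I made up for illustration, and the plot is rebuilt for each subset instead of swapping the data with %+%.

```r
library(ggplot2)

# split() returns a named list of data frames, one per level of cyl
pieces <- split(mtcars, mtcars$cyl)

# Hypothetical helper: build the same plot from whichever subset it is given
make_plot <- function(d) {
  ggplot(d, aes(x = wt, y = mpg)) +
    geom_line()
}

# lapply() keeps the list names, so each plot is labelled by its factor level
plots_base <- lapply(pieces, make_plot)

names(plots_base)    # "4" "6" "8"
plots_base[["4"]]    # draws the plot for cyl == 4 at the console
```

With the real data, the same pattern would be `split(dataset, dataset$ID)` followed by `lapply()`, giving one plot per subject.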

Edit

I started thinking about putting a title on each plot based on the factor, which seems like it would be useful.

dlply(mtcars, .(cyl), function(x) p %+% x + facet_wrap(~cyl))
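
The facet strip works as a label, but if you want an actual plot title, something like the following should also do it. This is a sketch under my own assumptions: it uses base `split()` and `Map()` rather than dlply, and `pieces` and `titled_plots` are illustrative names.

```r
library(ggplot2)

pieces <- split(mtcars, mtcars$cyl)

# Map() walks over the subsets and their level names in parallel,
# so each plot can use its own level in the title
titled_plots <- Map(function(d, level) {
  ggplot(d, aes(x = wt, y = mpg)) +
    geom_line() +
    ggtitle(paste("cyl =", level))
}, pieces, names(pieces))

titled_plots[["6"]]  # plot for cyl == 6, titled "cyl = 6"
```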

Edit 2

Here is one way to save the plots in a single document, one plot per page, working with the list named plots from above. I didn't change any of the defaults in pdf, so the file is written as Rplots.pdf in the working directory, but you can certainly explore the options you can change.

pdf()
plots
dev.off()
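
One caveat: ggplot objects are only drawn automatically when they are printed at the top level of the console. Inside a loop or a function you need an explicit `print()`. Below is a hedged sketch that writes to a named file; the path `out_file`, the page size, and the list name `plot_list` are my own choices, not anything required by pdf.

```r
library(ggplot2)

# One plot per level of cyl, built directly from each subset
plot_list <- lapply(split(mtcars, mtcars$cyl), function(d) {
  ggplot(d, aes(x = wt, y = mpg)) + geom_line()
})

# Hypothetical output location; replace with a path of your own
out_file <- file.path(tempdir(), "plots_by_cyl.pdf")

pdf(out_file, width = 7, height = 5)  # onefile = TRUE by default: one multi-page file
for (plt in plot_list) {
  print(plt)                          # each print() draws on a new page
}
dev.off()                             # close the device so the file is written
```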

Updated to use package dplyr instead of plyr. This is done in do, and the output will have a named column that contains all the plots as a list.

library(dplyr)
plots = mtcars %>%
    group_by(cyl) %>%
    do(plots = p %+% . + facet_wrap(~cyl))


Source: local data frame [3 x 2]
Groups: <by row>

  cyl           plots
1   4 <S3:gg, ggplot>
2   6 <S3:gg, ggplot>
3   8 <S3:gg, ggplot>

To see the plots in R, just ask for the column that contains the plots.

plots$plots

And to save as a pdf

pdf()
plots$plots
dev.off()
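
Note that do is marked as superseded in dplyr 1.0 and later. One possible replacement is `group_map()`, which calls a function once per group and returns a plain list. This is only a sketch, not the one right way to do it; `plot_list` is an illustrative name, and the plot is rebuilt inside the function rather than reusing p.

```r
library(ggplot2)
library(dplyr)

plot_list <- mtcars %>%
  group_by(cyl) %>%
  group_map(function(d, key) {
    # d is the data for one group; key is a one-row tibble holding its cyl value
    ggplot(d, aes(x = wt, y = mpg)) +
      geom_line() +
      ggtitle(paste("cyl =", key$cyl))
  })

length(plot_list)  # one plot per level of cyl
```

The result is an unnamed list, so it can be printed or passed to a pdf device in the same way as the list from dlply.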
