Creating multiple graphs based upon the column names


Question

This is my first question on Stack Overflow, so please correct me if I am not following the correct question protocol.

I am trying to create some graphs for data that has been collected over three time points (time 1, time 2, time 3) which equates to X1..., X2... and X3... at the beginning of column names. The graphs are also separated by the column $Group from the data frame.

I have no problem creating the graphs. I just have many variables (~170) and want to compare time 1 vs time 2, time 2 vs time 3, etc., so I am trying to find a shortcut for running this kind of code rather than having to type out each one individually.

As indicated above, I have created variable names like X1... X2... which indicate the time that the variable was recorded i.e. X1BCSTCAT = time 1; X2BCSTCAT = time 2; X3BCSTCAT = time 3. Here is a small sample of what my data looks like:

df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"), 
                   Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"), 
                   Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"), 
                   Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")), 
                   Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244), 
                   Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333), 
                   X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229), 
                   X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484), 
                   X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12), 
                   X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006), 
                   X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168), 
                   X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
                   X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18), 
                   X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255), 
                   X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)), 
              row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")

Here is some working code to create one graph using ggplot for time 1 vs time 2 data on one variable:

library(ggplot2)

# refer to columns by bare name inside aes(); ggplot2 looks them up in df
p <- ggplot(df, aes(x = X1BCSTCAT, y = X2BCSTCAT, shape = Group, color = Group)) + 
  geom_point() + geom_smooth(method = lm, aes(fill = Group), fullrange = TRUE) + 
  labs(title = "BCSTCAT", x = "Time 1", y = "Time 2") + 
  scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
  scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
  scale_fill_manual(name = "Group", labels = c("C8", "TC"), values = c("light blue", "pink"))

So I am really trying to create some kind of a shortcut where R will cycle through and match up variable names X1... vs X2... and so on and create the graphs. I assume there must be some way to plot either based upon matching column numbers e.g. df[,7] vs df[,10] and iterating through this process or plotting by actually matching the names (where the only difference in variable names is the number which indicates time).

I have previously cycled through creating individual graphs using the lapply function, but have no idea where to even start with trying to do this one.

Solution

A solution using the tidyeval approach. We will need ggplot2 v3.0.0 or later (remember to restart your R session after installing).

install.packages("ggplot2", dependencies = TRUE)

  • First we build a function that takes column and group names as inputs. Note the use of rlang::sym, rlang::quo_name & !!.

  • Then create 2 name vectors for x- & y- values so that we can loop through them simultaneously using purrr::map2.
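To see what the string-to-symbol step does on its own, here is a minimal base-R sketch. It is only an analogy: `as.name()` stands in for `rlang::sym()`, `bquote()` with `.()` stands in for `!!`, and `eval()` stands in for the data-masked lookup that `aes()` arranges. The `toy` data frame and its values are made up for illustration.

```r
# Toy data frame (hypothetical values, for illustration only)
toy <- data.frame(X1A = c(1, 2, 3), X2A = c(4, 5, 6))

col_name <- "X1A"              # a column name held as a string
col_sym  <- as.name(col_name)  # base-R analogue of rlang::sym()

# Evaluating the symbol with the data frame as its environment looks up
# the column, roughly what aes(x = !!col_sym) sets up for ggplot2.
values <- eval(col_sym, toy)

# bquote() splices the symbol into a larger expression via .(),
# a base-R analogue of unquoting with !!
expr    <- bquote(.(col_sym) * 2)
doubled <- eval(expr, toy)
```

The point of the conversion is that `aes()` expects column names as code rather than as strings, so a string coming from a loop has to be turned into a symbol before it is unquoted into the call.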

library(rlang)
library(tidyverse)

df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"), 
                   Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"), 
                   Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"), 
                   Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")), 
                   Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244), 
                   Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333), 
                   X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229), 
                   X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484), 
                   X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12), 
                   X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006), 
                   X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168), 
                   X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
                   X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18), 
                   X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255), 
                   X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)), 
              row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")

# define a function that accepts strings as input
pair_plot <- function(x_var, y_var, group_var) {

  # derive the plot title from the column name by dropping the "X<n>" time prefix
  plot_title <- sub("^X[0-9]+", "", x_var)

  # convert strings to symbols
  x_var <- rlang::sym(x_var)
  y_var <- rlang::sym(y_var)
  group_var <- rlang::sym(group_var)

  # unquote symbols using !! 
  ggplot(df, aes(x = !! x_var, y = !! y_var, shape = !! group_var, color = !! group_var)) + 
    geom_point() + geom_smooth(method = lm, aes(fill = !! group_var), fullrange = TRUE) + 
    labs(title = plot_title, x = rlang::quo_name(x_var), y = rlang::quo_name(y_var)) +
    scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
    scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
    scale_fill_manual(name = "Group",  labels = c("C8", "TC"), values = c("light blue", "pink")) +
    theme_bw()
}

# Test if the new function works
pair_plot("X1BCSTCAT", "X2BCSTCAT", "Group")

# Create 2 parallel character vectors of column names (x = earlier time point, y = next time point)
list_x <- colnames(df)[-c(1:6, (ncol(df)-2):(ncol(df)))]
list_x
#> [1] "X1BCSTCAT" "X1BCSTCR"  "X1BCSTPR"  "X2BCSTCAT" "X2BCSTCR"  "X2BCSTPR"

list_y <- lead(colnames(df)[-(1:6)], 3)[1:length(list_x)]
list_y
#> [1] "X2BCSTCAT" "X2BCSTCR"  "X2BCSTPR"  "X3BCSTCAT" "X3BCSTCR"  "X3BCSTPR"

# Loop through 2 lists simultaneously 
# Supply inputs to pair_plot function using purrr::map2
map2(list_x, list_y, ~ pair_plot(.x, .y, "Group"))

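Sample outputs (plot images are not reproduced here; the call returns a list with one ggplot object per pair of columns):

#> [[1]]
#>
#> [[2]]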
Created on 2018-05-24 by the reprex package (v0.2.0).
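The two name vectors above are built from column positions, which relies on the columns being in a fixed order. With ~170 variables it may be more robust to pair columns by name instead. The following is a minimal base-R sketch of that idea, not part of the original answer: `make_pairs` is a hypothetical helper, and it assumes every measurement column is named `X<time><variable>` as in the sample data.

```r
# Measurement column names as in the sample data. In practice these could be
# taken from the data frame, e.g. grep("^X[0-9]+", colnames(df), value = TRUE)
cols <- c("X1BCSTCAT", "X1BCSTCR", "X1BCSTPR",
          "X2BCSTCAT", "X2BCSTCR", "X2BCSTPR",
          "X3BCSTCAT", "X3BCSTCR", "X3BCSTPR")

# Hypothetical helper: pair each variable at one time point with the same
# variable at the next time point, matching on the name after the "X<n>" prefix.
make_pairs <- function(cols) {
  time <- as.integer(sub("^X([0-9]+).*$", "\\1", cols))  # time point number
  stem <- sub("^X[0-9]+", "", cols)                      # variable name without prefix
  next_name <- paste0("X", time + 1L, stem)              # same variable, next time point
  keep <- next_name %in% cols                            # drop the last time point
  data.frame(x = cols[keep], y = next_name[keep], stem = stem[keep],
             stringsAsFactors = FALSE)
}

name_pairs <- make_pairs(cols)
```

For the sample data, `name_pairs$x` and `name_pairs$y` hold the same names as `list_x` and `list_y`, so they can be passed to the same loop: `map2(name_pairs$x, name_pairs$y, ~ pair_plot(.x, .y, "Group"))`.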
