Creating multiple graphs based upon the column names


Question

This is my first question on Stack Overflow, so please correct me if I am not following the correct question protocol.

I am trying to create some graphs for data collected at three time points (time 1, time 2, time 3), which correspond to the prefixes X1..., X2... and X3... at the beginning of the column names. The graphs are also split by the Group column of the data frame.

I have no problem creating the graphs. I just have many variables (~170) and want to compare time 1 vs time 2, time 2 vs time 3, etc., so I am looking for a shortcut that runs this kind of code without my having to type out each plot individually.

As indicated above, I have created variable names like X1..., X2..., which indicate the time at which the variable was recorded, i.e. X1BCSTCAT = time 1; X2BCSTCAT = time 2; X3BCSTCAT = time 3. Here is a small sample of what my data looks like:

df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"), 
                   Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"), 
                   Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"), 
                   Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")), 
                   Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244), 
                   Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333), 
                   X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229), 
                   X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484), 
                   X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12), 
                   X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006), 
                   X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168), 
                   X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
                   X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18), 
                   X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255), 
                   X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)), 
              row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")

Here is some working code that uses ggplot to create one graph of time 1 vs time 2 for a single variable:

library(ggplot2)

# refer to columns by bare name inside aes(); df$... is not needed there
p <- ggplot(df, aes(x = X1BCSTCAT, y = X2BCSTCAT, shape = Group, color = Group)) + 
  geom_point() + geom_smooth(method = "lm", aes(fill = Group), fullrange = TRUE) + 
  labs(title = "BCSTCAT", x = "Time 1", y = "Time 2") + 
  scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
  scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
  scale_fill_manual(name = "Group", labels = c("C8", "TC"), values = c("light blue", "pink"))

So I am really trying to create some kind of shortcut where R cycles through the variables, matches up the names X1... vs X2... and so on, and creates the graphs. I assume there must be some way to plot either by matching column numbers, e.g. df[,7] vs df[,10], and iterating through this process, or by actually matching the names (where the only difference between variable names is the number that indicates the time).
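The name-matching idea described above can be sketched in base R. This is only an illustration under stated assumptions: the helper name pair_columns is hypothetical, and it assumes every time-stamped column starts with X followed by the time number, as in the sample data.

```r
# Column names as they appear in the sample data frame
cols <- c("ID", "Group", "Wave", "Yr", "Age.Yr.", "Training..hr.",
          "X1BCSTCAT", "X1BCSTCR", "X1BCSTPR",
          "X2BCSTCAT", "X2BCSTCR", "X2BCSTPR",
          "X3BCSTCAT", "X3BCSTCR", "X3BCSTPR")

# Hypothetical helper: pair each X<t>... column with its X<t+1>... partner
pair_columns <- function(cols) {
  # keep only the columns that carry a time prefix such as "X1"
  timed <- grep("^X[0-9]+", cols, value = TRUE)
  # split every name into its time point and its stem (the measured variable)
  time <- as.integer(sub("^X([0-9]+).*$", "\\1", timed))
  stem <- sub("^X[0-9]+", "", timed)
  # the partner column has the same stem at the next time point
  partner <- paste0("X", time + 1L, stem)
  # drop columns whose partner does not exist (e.g. the last time point)
  keep <- partner %in% timed
  data.frame(x = timed[keep], y = partner[keep], stem = stem[keep],
             stringsAsFactors = FALSE)
}

pairs <- pair_columns(cols)
pairs
```

Each row of pairs then names one x/y combination to plot, so the pairing no longer depends on column positions such as df[,7] vs df[,10].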

I have previously cycled through creating individual graphs using the lapply function, but I have no idea where to even start with this one.

Recommended answer

A solution using the tidyeval approach. It needs ggplot2 v3.0.0 or later (remember to restart your R session after installing):

install.packages("ggplot2", dependencies = TRUE)
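Before reinstalling, it may be worth checking whether the installed version is already new enough. A minimal sketch, where the variable name needs_update is just an illustrative choice:

```r
# TRUE if ggplot2 is missing or older than 3.0.0, FALSE otherwise
needs_update <- !requireNamespace("ggplot2", quietly = TRUE) ||
  utils::packageVersion("ggplot2") < "3.0.0"
needs_update
```

If this prints FALSE, the install.packages() call above can be skipped.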

  • First, we build a function that takes column and group names as inputs. Note the use of rlang::sym, rlang::quo_name and !!.

  • Then we create two name vectors for the x- and y-values so that we can loop through them simultaneously using purrr::map2.

      library(rlang)
      library(tidyverse)
      
      df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"), 
                         Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"), 
                         Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"), 
                         Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")), 
                         Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244), 
                         Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333), 
                         X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229), 
                         X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484), 
                         X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12), 
                         X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006), 
                         X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168), 
                         X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
                         X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18), 
                         X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255), 
                         X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)), 
                    row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")
      
      # define a function that accepts column names as strings
      pair_plot <- function(x_var, y_var, group_var) {
      
        # derive the title from the column name by dropping the time prefix (e.g. "X1"),
        # so that each plot is labelled with its own variable
        plot_title <- sub("^X[0-9]+", "", x_var)
      
        # convert strings to symbols
        x_var <- rlang::sym(x_var)
        y_var <- rlang::sym(y_var)
        group_var <- rlang::sym(group_var)
      
        # unquote symbols using !! 
        ggplot(df, aes(x = !! x_var, y = !! y_var, shape = !! group_var, color = !! group_var)) + 
          geom_point() + geom_smooth(method = "lm", aes(fill = !! group_var), fullrange = TRUE) + 
          labs(title = plot_title, x = rlang::quo_name(x_var), y = rlang::quo_name(y_var)) +
          scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
          scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
          scale_fill_manual(name = "Group",  labels = c("C8", "TC"), values = c("light blue", "pink")) +
          theme_bw()
      }
      
      # Test if the new function works
      pair_plot("X1BCSTCAT", "X2BCSTCAT", "Group")
      

      # Create 2 parallel lists 
      list_x <- colnames(df)[-c(1:6, (ncol(df)-2):(ncol(df)))]
      list_x
      #> [1] "X1BCSTCAT" "X1BCSTCR"  "X1BCSTPR"  "X2BCSTCAT" "X2BCSTCR"  "X2BCSTPR"
      
      list_y <- lead(colnames(df)[-(1:6)], 3)[1:length(list_x)]
      list_y
      #> [1] "X2BCSTCAT" "X2BCSTCR"  "X2BCSTPR"  "X3BCSTCAT" "X3BCSTCR"  "X3BCSTPR"
      
      # Loop through 2 lists simultaneously 
      # Supply inputs to pair_plot function using purrr::map2
      map2(list_x, list_y, ~ pair_plot(.x, .y, "Group"))
      

      Sample output (plot images not reproduced here):

      #> [[1]]
      #> 
      #> [[2]]

      Created on 2018-05-24 by the reprex package (v0.2.0).
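As a possible variation on the function above, ggplot2 (3.0.0 and later) also accepts the .data pronoun inside aes(), which takes column names as strings directly without rlang::sym and !!. The sketch below is an illustration rather than part of the original answer: the function name pair_plot_data and the explicit data argument are assumptions, and the small data frame is a subset of the sample data.

```r
library(ggplot2)

# Subset of the sample data: one variable at two time points
toy <- data.frame(
  Group     = factor(c("C8", "C8", "C8", "TC", "TC", "TC")),
  X1BCSTCAT = c(-0.156, 0.637, -1.133, 0.637, 2.189, 1.229),
  X2BCSTCAT = c(1.006, -0.379, -1.902, 0.444, 2.074, 1.006)
)

# Hypothetical variant of pair_plot: the data frame is passed in explicitly
# and columns are looked up by string through the .data pronoun
pair_plot_data <- function(data, x_var, y_var, group_var) {
  ggplot(data, aes(x = .data[[x_var]], y = .data[[y_var]],
                   shape = .data[[group_var]], color = .data[[group_var]])) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x,
                aes(fill = .data[[group_var]]), fullrange = TRUE) +
    # title is the column name without its time prefix, e.g. "BCSTCAT"
    labs(title = sub("^X[0-9]+", "", x_var), x = x_var, y = y_var,
         shape = group_var, color = group_var, fill = group_var) +
    theme_bw()
}

p <- pair_plot_data(toy, "X1BCSTCAT", "X2BCSTCAT", "Group")
```

Passing the data frame as an argument avoids relying on a global df, which may make the function easier to reuse with purrr::map2 on other data sets.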
