Applying a function to combinations of groups, holding 1 group fixed

Question

I have some data which looks like this:

   grp    date                id              Y
   <chr>  <dttm>              <chr>       <dbl>
 1 group1 2020-09-01 00:00:00 04003      17039.
 2 group1 2020-09-01 00:00:00 04006      13233.
 3 group1 2020-09-01 00:00:00 04011_AM    7918.
 4 group1 2020-09-01 00:00:00 0401301_AD 22586.
 5 group1 2020-09-01 00:00:00 0401303    20527.
 6 group1 2020-09-01 00:00:00 0401305    29422.
 7 group2 2020-09-01 00:00:00 22017_AM    7088.
 8 group2 2020-09-01 00:00:00 22021_AM    8134.
 9 group2 2020-09-01 00:00:00 22039_AM   15842.
10 group2 2020-09-01 00:00:00 22048      16142.

The data contain several groups. I also have a function:

normaliseData <- function(m) {
  (m - min(m)) / (max(m) - min(m))
}
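
As a quick illustration of what this function does, here is a minimal sketch on a made-up vector (the values in `x` are hypothetical and not taken from the data). It rescales the input so that the smallest value maps to 0 and the largest to 1:

```r
# Min-max normalisation, as defined in the question
normaliseData <- function(m) {
  (m - min(m)) / (max(m) - min(m))
}

# Hypothetical input, chosen only for illustration
x <- c(10, 20, 40)

# min(x) = 10 and max(x) = 40, so the result is 0, 1/3, 1
normaliseData(x)
```

Note that if all values in `m` are equal, the denominator is zero and the result is `NaN`.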

I want to normalise each group using the min and max of the pairwise values, holding group1 fixed. That is, for each other group, the min and max should be taken over the combined values of group1 and that group, which gives the following combinations:

  • group1 & group2
  • group1 & group3
  • group1 & group4

Data:

data <- structure(list(grp = c("group1", "group1", "group1", "group1", 
"group1", "group1", "group2", "group2", "group2", "group2", "group2", 
"group2", "group3", "group3", "group3", "group3", "group3", "group3", 
"group4", "group4", "group4", "group4", "group4", "group4"), 
    date = structure(c(1598918400, 1598918400, 1598918400, 1598918400, 
    1598918400, 1598918400, 1598918400, 1598918400, 1598918400, 
    1598918400, 1598918400, 1598918400, 1598918400, 1598918400, 
    1598918400, 1598918400, 1598918400, 1598918400, 1598918400, 
    1598918400, 1598918400, 1598918400, 1598918400, 1598918400
    ), tzone = "UTC", class = c("POSIXct", "POSIXt")), id = c("04003", 
    "04006", "04011_AM", "0401301_AD", "0401303", "0401305", 
    "22017_AM", "22021_AM", "22039_AM", "22048", "22053_AM", 
    "22054_AM", "28002", "28004", "2800501", "2800502", "2800503", 
    "2800504", "31010_AM", "31015_AM", "31016", "31019_AM", "31023", 
    "31029_AM"), Y = c(17039.329, 13232.982, 7917.693, 22585.676, 
    20527.113, 29422.471, 7087.536, 8134.265, 15842.035, 16142.111, 
    11493.981, 6556.387, 22086.768, 11325.882, 53449.067, 83662.101, 
    78508.089, 66107.125, 5095.169, 5590.531, 17796.439, 6028.701, 
    39271.698, 3642.281)), row.names = c(NA, -24L), groups = structure(list(
    grp = c("group1", "group2", "group3", "group4"), .rows = structure(list(
        1:6, 7:12, 13:18, 19:24), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, 4L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

I would like to apply the following:

# Min / max from group1 and group2
data %>%
  filter(grp == "group1" | grp == "group2") %>%
  mutate(
    normedOut = normaliseData(Y)
  )

# Min / max from group1 and group3
data %>%
  filter(grp == "group1" | grp == "group3") %>%
  mutate(
    normedOut = normaliseData(Y)
  )

# Min / max from group1 and group4
data %>%
  filter(grp == "group1" | grp == "group4") %>%
  mutate(
    normedOut = normaliseData(Y)
  )

Recommended answer

Here is one option with purrr, based on what I understand from your question. We create a vector, groups, containing the three groups to loop over, each paired with the fixed group1. For each one we apply your filter and mutate sequence, and name the new column after the group it was paired with. The results are row-bound into a single data frame with 3 new columns, each holding the normalised Y for group1 and one other group. Because each pair contributes its own block of rows, a column is NA in the rows that belong to a different pair (e.g. the group2 column is NA in the rows from the group1 and group3 pair).

library(dplyr)
library(purrr)

groups <- c("group2", "group3", "group4")
groups %>%
  map_dfr(~ data %>%
        ungroup() %>%  # data is grouped by grp; ungroup so min/max span both groups
        filter(grp == "group1" | grp == .x) %>%
        mutate(!!.x := normaliseData(Y)))
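
If a long format is easier to work with, one possible variant is sketched below. It is an illustration, not part of the original answer. It uses a small made-up tibble, `toy`, in place of `data`, and the column names `pair_with` and `normedOut` are chosen here for the example. Note that the `dput` output in the question has class `grouped_df` (grouped by `grp`), and `mutate()` on a grouped tibble computes the min and max within each group, so the sketch calls `ungroup()` first to make the min and max span both groups in the pair.

```r
library(dplyr)
library(purrr)

normaliseData <- function(m) {
  (m - min(m)) / (max(m) - min(m))
}

# Small made-up data set standing in for `data`; values are illustrative only
toy <- tibble(
  grp = rep(c("group1", "group2", "group3"), each = 2),
  Y   = c(10, 20, 0, 40, 5, 30)
) %>%
  group_by(grp)  # mimic the grouped tibble from the question

# Every group except the fixed one
others <- setdiff(unique(toy$grp), "group1")

result <- others %>%
  set_names() %>%  # name each element by itself so .id can label the pairs
  map_dfr(
    ~ toy %>%
      ungroup() %>%  # drop grouping so min/max are taken over the whole pair
      filter(grp %in% c("group1", .x)) %>%
      mutate(normedOut = normaliseData(Y)),
    .id = "pair_with"
  )

result
```

Here every pair writes to the same `normedOut` column and `pair_with` records which group was combined with group1, so no NA padding is needed.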
