Manipulating a character vector by considering a grouping sequence in R (4)

Problem description

I am trying to write code based on a group variable and item.map, a data frame of item information that includes a q-matrix showing which item is associated with which group.

group <- c(1,2)
ids <- c("54_a","54_b","44_a","44_c")
item.map <- data.frame(
  item.id = c("54_a","54_b","44_a","44_c"),
  group.1 = c(1,1,1,0),
  group.2 = c(0,1,0,1))

factor <- c(54,44)
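As a quick sanity check on how item.map encodes membership, the column sums of the q-matrix columns give the number of items per group, and the rows equal to 1 give the items themselves. This is only a minimal base R sketch; the names items_per_group and items_by_group are introduced here for illustration and are not part of the original code.

```r
item.map <- data.frame(
  item.id = c("54_a","54_b","44_a","44_c"),
  group.1 = c(1,1,1,0),
  group.2 = c(0,1,0,1))

# q-matrix columns are everything except the id column
q_cols <- item.map[-1]

# number of items associated with each group (hypothetical helper name)
items_per_group <- colSums(q_cols)

# ids of the items associated with each group, as a named list
items_by_group <- lapply(q_cols, function(col) {
  as.character(item.map$item.id)[col == 1]
})

items_per_group
items_by_group
```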

In this item.map, group.1 has three items while group.2 has two. I want to use item.map to assign those items in the chunk of code below, but I have not been able to plug the item.map information in.

library(stringr)

# define df for all ids and group combinations
group_g <- paste("G", 1:length(group), sep ="")
df <- data.frame(ids, group = rep(group_g, each = length(ids)))

# empty vector
vec <- NULL
for(i in 1:nrow(df)) {
  
  res <- which(str_extract(df[i, "ids"], "[0-9]{2,}") == factor)
  
  text <- paste("(", df[i, "group"], ", ", df[i, "ids"], ", fixed[", c(0:length(factor)) ,"]) = ", ifelse(res == 0:length(factor) | 0 == 0:length(factor), "1.0", "0.0"),";", sep = "")
  
  vec <- c(vec, text)
}

> vec
"(G1, 54_a, fixed[0]) = 1.0;" "(G1, 54_a, fixed[1]) = 1.0;" "(G1, 54_a, fixed[2]) = 0.0;" 
"(G1, 54_b, fixed[0]) = 1.0;" "(G1, 54_b, fixed[1]) = 1.0;" "(G1, 54_b, fixed[2]) = 0.0;" 
"(G1, 44_a, fixed[0]) = 1.0;" "(G1, 44_a, fixed[1]) = 0.0;" "(G1, 44_a, fixed[2]) = 1.0;" 
"(G1, 44_c, fixed[0]) = 1.0;" "(G1, 44_c, fixed[1]) = 0.0;" "(G1, 44_c, fixed[2]) = 1.0;" 
"(G2, 54_a, fixed[0]) = 1.0;" "(G2, 54_a, fixed[1]) = 1.0;" "(G2, 54_a, fixed[2]) = 0.0;"
"(G2, 54_b, fixed[0]) = 1.0;" "(G2, 54_b, fixed[1]) = 1.0;" "(G2, 54_b, fixed[2]) = 0.0;" 
"(G2, 44_a, fixed[0]) = 1.0;" "(G2, 44_a, fixed[1]) = 0.0;" "(G2, 44_a, fixed[2]) = 1.0;" 
"(G2, 44_c, fixed[0]) = 1.0;" "(G2, 44_c, fixed[1]) = 0.0;" "(G2, 44_c, fixed[2]) = 1.0;"

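The ifelse() expression inside the loop is dense, so here is a sketch of what it computes for a single item: fixed[0] is always 1.0, and fixed[k] is 1.0 only when the item's numeric prefix matches the k-th entry of factor. The helper name fixed_flags is hypothetical, and base R's regmatches()/regexpr() stand in for str_extract(); the sketch assumes every id contains a prefix that appears in factor.

```r
factor <- c(54, 44)

# hypothetical helper: flags for fixed[0], fixed[1], ..., fixed[length(factor)]
fixed_flags <- function(id, factor) {
  # first run of two or more digits in the id, e.g. "44" from "44_a"
  prefix <- regmatches(id, regexpr("[0-9]{2,}", id))
  # position of that prefix within factor (1 for 54, 2 for 44)
  res <- which(prefix == factor)
  k <- 0:length(factor)
  # index 0 is always on, index res is on, everything else is off
  ifelse(k == 0 | k == res, "1.0", "0.0")
}

fixed_flags("54_a", factor)  # "1.0" "1.0" "0.0"
fixed_flags("44_c", factor)  # "1.0" "0.0" "1.0"
```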
So, based on item.map, in the desired output G1 should not contain item 44_c, and G2 should not contain items 54_a and 44_a.

The desired output is:

> vec
"(G1, 54_a, fixed[0]) = 1.0;" "(G1, 54_a, fixed[1]) = 1.0;" "(G1, 54_a, fixed[2]) = 0.0;" 
"(G1, 54_b, fixed[0]) = 1.0;" "(G1, 54_b, fixed[1]) = 1.0;" "(G1, 54_b, fixed[2]) = 0.0;" 
"(G1, 44_a, fixed[0]) = 1.0;" "(G1, 44_a, fixed[1]) = 0.0;" "(G1, 44_a, fixed[2]) = 1.0;" 
"(G2, 54_b, fixed[0]) = 1.0;" "(G2, 54_b, fixed[1]) = 1.0;" "(G2, 54_b, fixed[2]) = 0.0;"
"(G2, 44_c, fixed[0]) = 1.0;" "(G2, 44_c, fixed[1]) = 0.0;" "(G2, 44_c, fixed[2]) = 1.0;"


Recommended answer

Here is an idea. I reshaped your item.map dataset into long format. The result, item.map2, has the same structure as your old dataset df, but with an additional column, used, that holds the required 0 and 1 values.
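To make the long format concrete, here is a minimal sketch that builds the same shape in base R, for readers who do not have tidyr at hand. It is an alternative to pivot_longer(), not the answer's own code; the names group_cols and long are hypothetical, and the sketch assumes the q-matrix columns are all named group.<n>.

```r
item.map <- data.frame(
  item.id = c("54_a","54_b","44_a","44_c"),
  group.1 = c(1,1,1,0),
  group.2 = c(0,1,0,1))

# q-matrix columns, assumed to follow the "group.<n>" naming pattern
group_cols <- grep("^group\\.", names(item.map), value = TRUE)

# one row per (group, item) pair, ordered by group, with the 0/1 flag in `used`
long <- data.frame(
  item.id = rep(as.character(item.map$item.id), times = length(group_cols)),
  group   = rep(sub("^group\\.", "G", group_cols), each = nrow(item.map)),
  used    = unlist(item.map[group_cols], use.names = FALSE),
  stringsAsFactors = FALSE)

long
```

Each of the four items appears once per group, so the result has eight rows, and the used column marks which of those pairs item.map actually allows.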

In the next step I added an if statement inside the loop, so that only rows where used is 1 are included in vec.

library(stringr)
library(dplyr)   # %>%, arrange(), mutate()
library(tidyr)   # pivot_longer()

# original dataset item.map
group <- c(1,2)
ids <- c("54_a","54_b","44_a","44_c")
item.map <- data.frame(
  item.id = c("54_a","54_b","44_a","44_c"),
  group.1 = c(1,1,1,0),
  group.2 = c(0,1,0,1))

factor <- c(54,44)

# reshape item.map 
item.map2 <- item.map %>%
  pivot_longer(-item.id, 
               names_to = "group",
               values_to = "used") %>%
  arrange(group) %>%
  mutate(group = str_replace(group, "group.", "G"),
         item.id = as.character(item.id))

# build the strings, keeping only rows flagged as used
vec <- NULL
for(i in seq_len(nrow(item.map2))) {
  if(item.map2$used[i] == 1) {
    # position of this item's numeric prefix within factor
    res <- which(str_extract(item.map2$item.id[i], "[0-9]{2,}") == factor)

    text <- paste("(", item.map2$group[i], ", ", item.map2$item.id[i],
                  ", fixed[", 0:length(factor), "]) = ",
                  ifelse(res == 0:length(factor) | 0 == 0:length(factor),
                         "1.0", "0.0"), ";", sep = "")

    vec <- c(vec, text)
  }
}

vec

Output

 [1] "(G1, 54_a, fixed[0]) = 1.0;" "(G1, 54_a, fixed[1]) = 1.0;" "(G1, 54_a, fixed[2]) = 0.0;"
 [4] "(G1, 54_b, fixed[0]) = 1.0;" "(G1, 54_b, fixed[1]) = 1.0;" "(G1, 54_b, fixed[2]) = 0.0;"
 [7] "(G1, 44_a, fixed[0]) = 1.0;" "(G1, 44_a, fixed[1]) = 0.0;" "(G1, 44_a, fixed[2]) = 1.0;"
[10] "(G2, 54_b, fixed[0]) = 1.0;" "(G2, 54_b, fixed[1]) = 1.0;" "(G2, 54_b, fixed[2]) = 0.0;"
[13] "(G2, 44_c, fixed[0]) = 1.0;" "(G2, 44_c, fixed[1]) = 0.0;" "(G2, 44_c, fixed[2]) = 1.0;"
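For comparison, the same vector can also be produced without reshaping, by looping over the groups and, within each group, over the items whose q-matrix entry is 1. This is a base R sketch under the assumption that the q-matrix columns are named group.<n> to match the values in group; it is not the answerer's method, and the names k and kept are hypothetical.

```r
group <- c(1, 2)
item.map <- data.frame(
  item.id = c("54_a","54_b","44_a","44_c"),
  group.1 = c(1,1,1,0),
  group.2 = c(0,1,0,1))
factor <- c(54, 44)

k <- 0:length(factor)   # indices used inside fixed[...]
vec <- NULL
for (g in group) {
  # items whose q-matrix entry for this group is 1
  kept <- as.character(item.map$item.id)[item.map[[paste0("group.", g)]] == 1]
  for (id in kept) {
    # position of the item's numeric prefix within factor
    res <- which(regmatches(id, regexpr("[0-9]{2,}", id)) == factor)
    vec <- c(vec, paste0("(G", g, ", ", id, ", fixed[", k, "]) = ",
                         ifelse(k == 0 | k == res, "1.0", "0.0"), ";"))
  }
}

vec
```

Filtering on the q-matrix column directly keeps item.map as the single source of truth, at the cost of relying on the column naming convention.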
