Manipulating a character vector by considering a grouping sequence in R


Problem description

I had a similar question before [here][1], but this one is slightly different.

I have an id vector ids, a grouping variable group, and a factor variable factor that holds the numeric prefixes appearing before the _ in ids.

ids <- c("54_a", "54_b", "44_a", "44_c")
group <- c(1, 2)
factor <- c(54, 44)

Rules for the output:

  1. The row that has fixed[0] should always equal 1.
  2. For the first factor, the row that has fixed[1] should equal 1 and the row that has fixed[2] should equal 0.
  3. For the second factor, the row that has fixed[1] should equal 0 and the row that has fixed[2] should equal 1.
  4. So the number in fixed[#] represents the factor number, and when that factor is considered, this row should equal 1.
  5. The procedure needs to be repeated for the two groups (G1, G2).
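The rules can be read as an indicator matrix with one row per id and one column per fixed[#] index. Below is a minimal base-R sketch of that reading, assuming the factor position is found by matching the numeric prefix of each id against factor. The names factor_index, k and indicator are illustrative and do not come from the original post.

```r
ids <- c("54_a", "54_b", "44_a", "44_c")
factor <- c(54, 44)

# Numeric prefix of each id (everything before the underscore),
# matched against factor to get its position: 1 for 54, 2 for 44.
factor_index <- match(as.numeric(sub("_.*$", "", ids)), factor)

# Columns correspond to fixed[0], fixed[1], ..., fixed[length(factor)].
k <- 0:length(factor)

# A cell is 1 when the column is fixed[0] or the column index equals
# the factor position of the row's id, and 0 otherwise.
indicator <- outer(factor_index, k, function(f, j) as.numeric(j == 0 | j == f))
dimnames(indicator) <- list(ids, paste0("fixed[", k, "]"))
indicator
```

The same matrix applies to every group, so rule 5 only repeats these rows with a different group label.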

My desired output is below:

#for the first factor first group
(G1, 54_a, fixed[0]) = 1.0; # this is always 1
(G1, 54_a, fixed[1]) = 1.0; # 1 for factor 1
(G1, 54_a, fixed[2]) = 0.0; # 0 for factor 2

(G1, 54_b, fixed[0]) = 1.0; # this is always 1
(G1, 54_b, fixed[1]) = 1.0; # 1 for factor 1
(G1, 54_b, fixed[2]) = 0.0; # 0 for factor 2


#for the second factor
(G1, 44_a, fixed[0]) = 1.0; # this is always 1
(G1, 44_a, fixed[1]) = 0.0; # 0 for factor 1
(G1, 44_a, fixed[2]) = 1.0; # 1 for factor 2

(G1, 44_c, fixed[0]) = 1.0; # this is always 1
(G1, 44_c, fixed[1]) = 0.0; # 0 for factor 1
(G1, 44_c, fixed[2]) = 1.0; # 1 for factor 2


#for the first factor second group
(G2, 54_a, fixed[0]) = 1.0; # this is always 1
(G2, 54_a, fixed[1]) = 1.0; # 1 for factor 1
(G2, 54_a, fixed[2]) = 0.0; # 0 for factor 2

(G2, 54_b, fixed[0]) = 1.0; # this is always 1
(G2, 54_b, fixed[1]) = 1.0; # 1 for factor 1
(G2, 54_b, fixed[2]) = 0.0; # 0 for factor 2


#for the second factor
(G2, 44_a, fixed[0]) = 1.0; # this is always 1
(G2, 44_a, fixed[1]) = 0.0; # 0 for factor 1
(G2, 44_a, fixed[2]) = 1.0; # 1 for factor 2

(G2, 44_c, fixed[0]) = 1.0; # this is always 1
(G2, 44_c, fixed[1]) = 0.0; # 0 for factor 1
(G2, 44_c, fixed[2]) = 1.0; # 1 for factor 2

I was able to produce the first row for each chunk of output:

Fixed.Set.1 <- c()
for (g in seq_along(group)) {
  fixed.set.1 <- paste0(paste("(", "G", g, ", ", ids, ",", " fixed[0]) = 1", collapse = "; ", sep = ""), "; ")
  Fixed.Set.1 <- c(Fixed.Set.1, fixed.set.1)
}

> Fixed.Set.1
[1] "(G1, 54_a, fixed[0]) = 1; (G1, 54_b, fixed[0]) = 1; (G1, 44_a, fixed[0]) = 1; (G1, 44_c, fixed[0]) = 1; "
[2] "(G2, 54_a, fixed[0]) = 1; (G2, 54_b, fixed[0]) = 1; (G2, 44_a, fixed[0]) = 1; (G2, 44_c, fixed[0]) = 1; "

Any ideas on how to deal with the rest? Thanks.

[1]: r manipulation a character vector for a sequence

Recommended answer

A first attempt:

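library(stringr)

# define df for all ids and group combinations
# (ids is recycled once per group label)
group_g <- paste0("G", seq_along(group))
df <- data.frame(ids, group = rep(group_g, each = length(ids)), stringsAsFactors = FALSE)

# empty vector to collect the output lines
vec <- NULL

for (i in seq_len(nrow(df))) {
  # position in factor of the numeric prefix of this id
  res <- which(str_extract(df[i, "ids"], "[0-9]{2,}") == factor)

  # one line per fixed[0..length(factor)]:
  # "1.0" for fixed[0] and for the matching factor, "0.0" otherwise
  text <- paste("(", df[i, "group"], ", ", df[i, "ids"], ", fixed[", 0:length(factor), "]) = ",
                ifelse(res == 0:length(factor) | 0 == 0:length(factor), "1.0", "0.0"), ";", sep = "")

  vec <- c(vec, text)
}

vec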
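
The loop above grows vec one id at a time. As an alternative that is not part of the original answer, the same lines can be built in one vectorized pass with base R only. This is a sketch under the assumption that the wanted order is group, then id, then fixed index. The names grid, f_idx, value and vec2 are illustrative.

```r
ids <- c("54_a", "54_b", "44_a", "44_c")
group <- c(1, 2)
factor <- c(54, 44)

k <- 0:length(factor)

# One row per (fixed index, id, group). expand.grid varies its first
# argument fastest, so rows run fixed[0..2] within each id within each group.
grid <- expand.grid(k = k, ids = ids, g = seq_along(group),
                    stringsAsFactors = FALSE)

# Position in factor of each id's numeric prefix.
f_idx <- match(as.numeric(sub("_.*$", "", grid$ids)), factor)

# "1.0" for fixed[0] and for the matching factor, "0.0" otherwise.
value <- ifelse(grid$k == 0 | grid$k == f_idx, "1.0", "0.0")

vec2 <- paste0("(G", grid$g, ", ", grid$ids, ", fixed[", grid$k, "]) = ", value, ";")
```

Calling cat(vec2, sep = "\n") prints one statement per line.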
