Tips for refactoring repetitive code in R
Question
How would a real R programmer go about writing blocks of code with redundant steps? Here I'm copy/pasting/editing each line, which works fine for this small analysis, but for bigger ones gets unwieldy. In SAS, I'd write macros. Looking for a generating principle in R.
This example has a few typical patterns, like recoding a set of columns that are consecutive and/or have a numbering pattern. Also, the recoding logic is just "replace NA with 0 and then add 1", as input to another algorithm that expects positive integers for input variables, here using the car library.
data$rs14_1 <- recode(data$q14_1,"5=6;4=5;3=4;2=3;1=2;0=1;NA=0")
data$rs14_2 <- recode(data$q14_2,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_3 <- recode(data$q14_3,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_4 <- recode(data$q14_4,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_5 <- recode(data$q14_5,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_6 <- recode(data$q14_6,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_7 <- recode(data$q14_7,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_8 <- recode(data$q14_8,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
data$rs14_9 <- recode(data$q14_9,"5=6;4=5;3=4;2=3;1=2;0=1;NA=1")
Answer
Assuming that the recoding in the first line is supposed to be the same as the rest:
Recode all data columns and collect the result in a new data frame (lapply returns a list, so it is converted with as.data.frame):
newdata <- as.data.frame(lapply(data,recode,"5=6;4=5;3=4;2=3;1=2;0=1;NA=0"))
Set names for the new data frame based on the old one:
names(newdata) <- gsub("^q","rs",names(newdata))
Put them together:
data <- cbind(data,newdata)
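The three steps above apply the recoding to every column of data, which is only safe when the data frame holds nothing but the survey items. The following is a minimal sketch of the same lapply, rename, cbind pattern restricted to the q14_ columns. The toy data frame and the helper shift_score are assumptions made for illustration: shift_score is a base-R stand-in for the recode call, mapping NA to 1 as most lines in the question do.

```r
# Toy data: an id column plus three survey items (hypothetical values).
data <- data.frame(
  id    = 1:4,
  q14_1 = c(0, 3, NA, 5),
  q14_2 = c(1, NA, 2, 4),
  q14_3 = c(NA, 0, 5, 1)
)

# Hypothetical stand-in for the recode() call: NA becomes 1, k becomes k + 1.
shift_score <- function(x) ifelse(is.na(x), 1, x + 1)

# Select only the columns whose names match the numbering pattern.
qcols <- grep("^q14_[0-9]+$", names(data), value = TRUE)

# Apply the transformation column by column; lapply returns a named list.
newdata <- as.data.frame(lapply(data[qcols], shift_score))

# Derive the new names from the old ones, then bind the results on.
names(newdata) <- sub("^q", "rs", names(newdata))
data <- cbind(data, newdata)
```

Selecting columns by a name pattern keeps the id column untouched and means a tenth item, q14_10, would be picked up without any further edits.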
But really, you should probably use:
newdata <- data
newdata[is.na(newdata)] <- 0
newdata <- newdata+1
(rather than recode) to do the transformation, followed by the renaming and cbind-ing steps.
(It would help if you gave a reproducible example.)
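Since the question asks for something like a SAS macro, the arithmetic version can be wrapped in a function and reused across column groups. This is only a sketch under stated assumptions: the function name add_shifted, its pattern and prefix arguments, and the survey data frame are all hypothetical, and it assumes the matched columns are numeric.

```r
# Hypothetical helper: for every column whose name matches `pattern`,
# treat NA as 0, add 1, and append the result under a renamed column.
add_shifted <- function(df, pattern = "^q", prefix = "rs") {
  cols <- grep(pattern, names(df), value = TRUE)
  shifted <- df[cols]                  # subset stays a data frame
  shifted[is.na(shifted)] <- 0         # replace NA with 0 in all selected columns
  shifted <- shifted + 1               # add 1 to every cell at once
  names(shifted) <- sub(pattern, prefix, cols)
  cbind(df, shifted)
}

# Toy input with made-up values.
survey <- data.frame(id = 1:3, q14_1 = c(0, NA, 5), q14_2 = c(NA, 2, 4))
result <- add_shifted(survey)
```

Because the whole block of columns is transformed with vectorised data frame operations, no per-column loop or recode string is needed, and the function can be called again with a different pattern for another question group.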