SAS Array equivalent in R
Question
I have a dataset with the following columns:
ID Measure1 Measure2 XO X1 x2 x3 x4 x5
1 30 2 item1 item1 item23 NA item6 item9
2 23 2 item1 item323 item1 item4 item5 NA
3 2 2 item1 item78 item3 NA item1 item5
and I want to create a flag variable in R that does what this piece of SAS code does:
data dt2;
  set dt1;
  array x {5} x1 - x5;
  do i=1 to 5;
    if x0=x{i} then do;
      flag=i;
      leave;
    end;
  end;
  drop i;
run;
The goal is to go through the values of x1-x5, find where XO equals any of them, and return that position. For example, if item1 is found at x1, return 1; if it is found at x3, return 3.
The end product would look something like this:
ID Measure1 Measure2 XO X1 x2 x3 x4 x5 Flag
1 30 2 item1 item1 item23 NA item6 item9 1
2 23 2 item1 item323 item1 item4 item5 NA 2
3 2 2 item1 item78 item3 NA item1 item5 4
Keep in mind that there might be rows where all of x1-x5 contain NA; in that case I would like to return blank. Is this possible?
I haven't been able to find an equivalent in R that is dynamic (that is, without writing multiple if statements or a CASE WHEN with sqldf), because there are 5 columns now but that could grow to as many as 20 in the future.
Any ideas?
Recommended answer
We can use max.col:
df1$Flag <- max.col(df1$XO[row(df1[-1])]==df1[-1], 'first')
df1
# XO X1 x2 x3 x4 x5 Flag
#1 item1 item1 item23 item5 item6 item9 1
#2 item1 item323 item1 item4 item5 itm87 2
#3 item1 item78 item3 item98 item1 item5 4
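To make the one-liner above easier to follow, here is a minimal sketch that rebuilds the example data and splits the computation into two steps. The names `hits` and `flag` are illustrative only, and the comparison is written with as.matrix rather than the row() indexing used in the answer; both are intended to produce the same logical matrix.

```r
# Toy data mirroring the answer's printed example (no missing values).
df1 <- data.frame(
  XO = c("item1", "item1", "item1"),
  X1 = c("item1", "item323", "item78"),
  x2 = c("item23", "item1", "item3"),
  x3 = c("item5", "item4", "item98"),
  x4 = c("item6", "item5", "item1"),
  x5 = c("item9", "itm87", "item5"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a candidate column equals XO in the same row.
# df1$XO has one element per row, so it is recycled down each column.
hits <- as.matrix(df1[-1]) == df1$XO

# Column index of the first TRUE in each row.
flag <- max.col(hits, ties.method = "first")
flag
```

Note that with ties.method = "first", a row containing no TRUE at all would also come back as 1, because every entry ties at the maximum. That is why rows without a match need separate handling.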
Update
Based on the updated dataset, we can replace the NA elements in the logical matrix with FALSE and then use max.col. If a row has no TRUE values, we can set its result to NA: take the rowSums, check whether it is 0, turn those cases into NA with NA^(..) (NA^1 is NA, whereas NA^0 is 1), and multiply by the result of max.col(..).
df3 <- df2[5:ncol(df2)]
i1 <- df2$XO[row(df3)]==df3
i2 <- replace(i1, is.na(i1), FALSE)
df2$Flag <- max.col(i2, 'first') * NA^(rowSums(i2)==0)
df2
# ID Measure1 Measure2 XO X1 x2 x3 x4 x5 Flag
#1 1 30 2 item1 item1 item23 <NA> item6 item9 1
#2 2 23 2 item1 item323 item1 item4 item5 <NA> 2
#3 3 2 2 item1 item78 item3 <NA> item1 item5 4
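Since the question asks for something that keeps working when the number of x columns grows, the same idea can be wrapped in a small function that selects the columns by name. This is only a sketch under stated assumptions: `first_match` is a hypothetical helper, not part of any package; the fourth row is invented here to exercise the all-NA case; and the regular expression assumes the candidate columns are named x followed by digits, in either case. R has no "blank" for numeric columns, so NA is used for rows without a match.

```r
# Hypothetical helper: position of the first column in `cols` equal to `key`.
first_match <- function(df, key, cols) {
  hits <- as.matrix(df[cols]) == df[[key]]
  hits[is.na(hits)] <- FALSE           # comparisons with NA count as "no match"
  pos <- max.col(hits, ties.method = "first")
  pos[rowSums(hits) == 0] <- NA        # no match anywhere in the row -> NA
  pos
}

df2 <- data.frame(
  ID = 1:4,
  Measure1 = c(30, 23, 2, 7),          # row 4 is made up for illustration
  Measure2 = c(2, 2, 2, 2),
  XO = c("item1", "item1", "item1", "item1"),
  X1 = c("item1", "item323", "item78", NA),
  x2 = c("item23", "item1", "item3", NA),
  x3 = c(NA, "item4", NA, NA),
  x4 = c("item6", "item5", "item1", NA),
  x5 = c("item9", NA, "item5", NA),
  stringsAsFactors = FALSE
)

# Pick up however many x<number> columns exist, ignoring case.
# "XO" ends in the letter O, so it is not selected.
cols <- grep("^x[0-9]+$", names(df2), ignore.case = TRUE, value = TRUE)
df2$Flag <- first_match(df2, "XO", cols)
df2
```

The positions are relative to the order of the selected columns in the data frame, so this assumes x1 through x5 (or x20) appear in ascending order.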