How can I use one column to determine where I get the value for another column?
Question
I'm trying to use one column to determine which column to use as the value for another column. It looks something like this:
  X Y Z Target
1 a b c      X
2 d e f      Y
3 g h i      Z
And I want something that looks like this:
  X Y Z Target TargetValue
1 a b c      X           a
2 d e f      Y           e
3 g h i      Z           i
Here each TargetValue is the value taken from the column that Target names. I've been using dplyr a bit to get this to work. If I knew how to make the output of paste the input for mutate, that would be great,
mutate(TargetWordFixed = (paste("WordMove",TargetWord,".rt", sep="")))
but maybe there is another way to do the same thing.
Be gentle, I'm new to both stackoverflow and R...
Recommended answer
A vectorized approach would be to use matrix subsetting:
df %>% mutate(TargetValue = .[cbind(1:n(), match(Target, names(.)))])
#  X Y Z Target TargetValue
#1 a b c      X           a
#2 d e f      Y           e
#3 g h i      Z           i
Or just using base R (same approach):
transform(df, TargetValue = df[cbind(1:nrow(df), match(Target, names(df)))])
Explanation:
- match(Target, names(.)) computes the column index of each entry in Target (that is, the position of the column called X, Y and so on). The . in the dplyr version refers to the data you "pipe" into the mutate statement with %>% (i.e. it refers to df).
- df[cbind(1:nrow(df), match(Target, names(df)))] subsets df with a two-column matrix, so exactly one cell is picked per row. The first column of the matrix holds the row numbers from 1 to the number of rows of df (hence 1:nrow(df)), and the second column holds the index of the column that contains the Target value of interest (computed by match(Target, names(df))).
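The two steps above can be tried out one at a time. This is only a minimal sketch: it rebuilds the example data by hand, and the stringsAsFactors = FALSE argument and the intermediate names col_idx and idx are my own additions for illustration, not part of the original answer.

```r
# Example data from the question; Target is kept as plain character
df <- data.frame(
  X = c("a", "d", "g"),
  Y = c("b", "e", "h"),
  Z = c("c", "f", "i"),
  Target = c("X", "Y", "Z"),
  stringsAsFactors = FALSE
)

# Step 1: position of the column named by each Target entry
col_idx <- match(df$Target, names(df))

# Step 2: two-column matrix of (row, column) pairs, one pair per row of df
idx <- cbind(seq_len(nrow(df)), col_idx)

# Step 3: indexing a data frame with such a matrix returns one cell per pair
df$TargetValue <- df[idx]

print(df)
```

seq_len(nrow(df)) is used instead of 1:nrow(df) only so that the sketch also behaves sensibly on a data frame with zero rows.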
The matrix that is produced for subsetting the example data is:
cbind(1:nrow(df), match(df$Target, names(df)))
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    3
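The same idea should carry over to the paste call from the question, where the target column name is built from a prefix and a suffix. The sketch below is only an assumption about what that data might look like: the data frame rt, its column names WordMoveA.rt and WordMoveB.rt, and all the values are made up for illustration. One thing to watch for is that indexing a data frame with a matrix goes through as.matrix, so a data frame that mixes numeric and character columns yields character results. Restricting the lookup to the numeric columns first avoids that.

```r
# Hypothetical data: column names and values are invented for illustration
rt <- data.frame(
  WordMoveA.rt = c(510, 620, 430),
  WordMoveB.rt = c(480, 700, 455),
  TargetWord = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# Build the target column name for each row, as in the question's paste() call
target_cols <- paste("WordMove", rt$TargetWord, ".rt", sep = "")

# Keep only the candidate columns so the matrix stays numeric
rt_cols <- grep("^WordMove.*\\.rt$", names(rt), value = TRUE)
rt_mat <- as.matrix(rt[rt_cols])

# (row, column) pairs, with column positions taken relative to rt_cols
idx <- cbind(seq_len(nrow(rt)), match(target_cols, rt_cols))
rt$TargetWordFixed <- rt_mat[idx]

print(rt)
```

If a TargetWord value has no matching column, match returns NA for that row, and the corresponding TargetWordFixed entry comes out as NA rather than raising an error.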