How can I use one column to determine where I get the value for another column?

Question

I'm trying to use one column to determine which column to use as the value for another column. It looks something like this:

     X   Y  Z   Target
1    a   b  c    X
2    d   e  f    Y
3    g   h  i    Z

And I want something that looks like this:

     X   Y  Z   Target  TargetValue
1    a   b  c    X           a
2    d   e  f    Y           e
3    g   h  i    Z           i

Where each TargetValue is the value determined by the column specified by Target. I've been using dplyr a bit to get this to work. If I knew how to make the output of paste the input for mutate that would be great,

mutate(TargetWordFixed = (paste("WordMove",TargetWord,".rt", sep="")))

but maybe there is another way to do the same thing.

Be gentle, I'm new to both stackoverflow and R...

Recommended answer

A vectorized approach would be to use matrix subsetting:

df %>% mutate(TargetValue = .[cbind(1:n(), match(Target, names(.)))])
#  X Y Z Target TargetValue
#1 a b c      X           a
#2 d e f      Y           e
#3 g h i      Z           i

Or just using base R (same approach):

transform(df, TargetValue = df[cbind(1:nrow(df), match(Target, names(df)))])

Explanation:


  • match(Target, names(.)) computes the column indices of the entries in Target (which column is called X etc)
  • The . in the dplyr version refers to the data you "pipe" into the mutate statement with %>% (i.e. it refers to df)
  • df[cbind(1:nrow(df), match(df$Target, names(df)))] builds a two-column matrix and uses it to subset df to the correct values: the first column holds the row numbers from 1 to the number of rows of df (hence 1:nrow(df)), and the second column holds the index of the column containing the Target value of interest (computed by match(df$Target, names(df))).
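
The three points above can be traced step by step in base R. This is only a minimal sketch: the data frame is rebuilt from the example in the question, and the intermediate names col_idx and idx are introduced here for illustration.

```r
# Example data from the question; keep the columns as character, not factor
df <- data.frame(
  X = c("a", "d", "g"),
  Y = c("b", "e", "h"),
  Z = c("c", "f", "i"),
  Target = c("X", "Y", "Z"),
  stringsAsFactors = FALSE
)

# Step 1: for each row, find the position of the column named in Target
col_idx <- match(df$Target, names(df))

# Step 2: pair each row number with its column position
idx <- cbind(seq_len(nrow(df)), col_idx)

# Step 3: index with the two-column matrix; each row of idx picks one cell
df$TargetValue <- df[idx]

print(df)
```

Indexing a data frame with a matrix goes through as.matrix(), so if the columns have mixed types, the extracted values are coerced to a common type (typically character).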

The matrix that is produced for subsetting the example data is:

cbind(1:nrow(df), match(df$Target, names(df)))
     [,1] [,2]
[1,]    1    1
[2,]    2    2
[3,]    3    3
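
The same idea might carry over to the paste() attempt in the question, where the column name is assembled from a prefix, the value of TargetWord, and a suffix. The sketch below is an assumption about what that data looks like: the column names WordMove1.rt and WordMove2.rt, the values, and the helper names target_col, rt_cols and rt_mat are all invented for illustration.

```r
# Hypothetical data: column names and values are invented for illustration
df2 <- data.frame(
  WordMove1.rt = c(510, 620, 430),
  WordMove2.rt = c(480, 590, 700),
  TargetWord   = c(2, 1, 2)
)

# Build, for each row, the name of the column to read from
target_col <- paste0("WordMove", df2$TargetWord, ".rt")

# Restrict the lookup to the numeric columns so the result stays numeric
rt_cols <- c("WordMove1.rt", "WordMove2.rt")
rt_mat  <- as.matrix(df2[rt_cols])

# One (row, column) pair per row, as in the answer above
df2$TargetWordFixed <- rt_mat[cbind(seq_len(nrow(df2)), match(target_col, rt_cols))]

print(df2$TargetWordFixed)
# 480 620 700
```

Restricting the matrix to the value columns is a design choice made here, not part of the original answer; it avoids the type coercion that would occur if a character column were included in the lookup.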
