How to populate values of one row conditional on another row in R?

Question

I inherited a data set coded in an unusual way. I would like to learn a less verbose way of reshaping it. The data frame looks like this:

# Input.
participant <- c(rep("John", 6), rep("Mary", 6))
day         <- c(rep(1, 3), rep(2, 3), rep(1, 3), rep(2, 3))
likes       <- c("apples", "apples", "18", "apples", "apples", "7",
                 "bananas", "bananas", "24", "bananas", "bananas", "3")
question    <- rep(c(1, 1, 0), 4)
# Desired values for the new column (not part of the input data frame).
number      <- c(rep(18, 3), rep(7, 3), rep(24, 3), rep(3, 3))
df          <- data.frame(participant, day, question, likes)

   participant day question   likes
1         John   1        1  apples
2         John   1        1  apples
3         John   1        0      18
4         John   2        1  apples
5         John   2        1  apples
6         John   2        0       7
7         Mary   1        1 bananas
8         Mary   1        1 bananas
9         Mary   1        0      24
10        Mary   2        1 bananas
11        Mary   2        1 bananas
12        Mary   2        0       3

As you can see, the column likes is heterogeneous. When question equals 0, likes conveys a number chosen by the participants, not their preferred fruit. So I would like to re-code it in a new column as follows:

   participant day question   likes number
1         John   1        1  apples     18
2         John   1        1  apples     18
3         John   1        0      18     18
4         John   2        1  apples      7
5         John   2        1  apples      7
6         John   2        0       7      7
7         Mary   1        1 bananas     24
8         Mary   1        1 bananas     24
9         Mary   1        0      24     24
10        Mary   2        1 bananas      3
11        Mary   2        1 bananas      3
12        Mary   2        0       3      3

My current solution with base R involves subsetting the initial data frame, creating a lookup table, changing the column names and then merging the lookup table with the original data frame. But this takes several steps, and I suspect there is a simpler solution. I think that tidyr might be the answer, but I don't know how to use it to spread values in one column (likes) conditional on other columns (day and question).
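
For reference, the multi-step base R approach described above might look roughly like the following. This is only a sketch of one way to do it: the helper names `lookup`, `row_id` and `result` are made up for illustration, and `stringsAsFactors = FALSE` is set explicitly so that the example behaves the same on old and new R versions.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(
  participant = rep(c("John", "Mary"), each = 6),
  day         = rep(rep(c(1, 2), each = 3), times = 2),
  question    = rep(c(1, 1, 0), times = 4),
  likes       = c("apples", "apples", "18", "apples", "apples", "7",
                  "bananas", "bananas", "24", "bananas", "bananas", "3"),
  stringsAsFactors = FALSE
)

# 1. Subset the rows that carry the chosen number.
lookup <- df[df$question == 0, c("participant", "day", "likes")]

# 2. Rename the value column and convert it to numeric.
names(lookup)[names(lookup) == "likes"] <- "number"
lookup$number <- as.numeric(lookup$number)

# 3. Merge the lookup table back onto the original rows.
#    merge() may reorder rows, so restore the original order
#    with an explicit (hypothetical) row id.
df$row_id <- seq_len(nrow(df))
result <- merge(df, lookup, by = c("participant", "day"))
result <- result[order(result$row_id), ]
result$row_id <- NULL
rownames(result) <- NULL
```

The three numbered steps are the ones the question calls verbose. The extra bookkeeping around `row_id` is there only because `merge()` does not promise to keep the input row order.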

Do you have any suggestions? Thanks a lot!

Recommended answer

Using the data set above, you can try the following. You group your data by participant and day and look for a row with question == 0 for each group.

library(dplyr)
# as.character() guards against likes being a factor
# (the data.frame() default before R 4.0).
group_by(df, participant, day) %>%
    mutate(age = as.numeric(as.character(likes[which(question == 0)])))

Or as alistaire suggested, you can use grep() too.

group_by(df, participant, day) %>%
    mutate(age = as.numeric(grep('\\d+', likes, value = TRUE)))

#   participant   day question   likes   age
#        (fctr) (dbl)    (dbl)  (fctr) (dbl)
#1         John     1        1  apples    18
#2         John     1        1  apples    18
#3         John     1        0      18    18
#4         John     2        1  apples     7
#5         John     2        1  apples     7
#6         John     2        0       7     7
#7         Mary     1        1 bananas    24
#8         Mary     1        1 bananas    24
#9         Mary     1        0      24    24
#10        Mary     2        1 bananas     3
#11        Mary     2        1 bananas     3
#12        Mary     2        0       3     3
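
A caveat on the grep() variant: the pattern '\\d+' matches any string that contains a digit, not only strings made up entirely of digits. With the data above that makes no difference, but the small sketch below shows how it could go wrong. The value "7up" is a made-up example and does not appear in the original data.

```r
# With the original values, only the numeric string matches.
likes <- c("apples", "apples", "18")
loose_ok <- grep("\\d+", likes, value = TRUE)

# Hypothetical values: a fruit-like label that happens to contain a digit.
mixed <- c("7up", "7up", "7")

# The unanchored pattern matches all three strings ...
loose <- grep("\\d+", mixed, value = TRUE)

# ... whereas anchoring with ^ and $ keeps only the all-digit string.
strict <- grep("^\\d+$", mixed, value = TRUE)
```

If labels like this could occur in the real data, the anchored pattern '^\\d+$' is probably the safer choice inside mutate(), since it returns only the all-digit string in each group.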

If you want to use data.table, you can do:

library(data.table)
setDT(df)[, age := as.numeric(as.character(likes[which(question == 0)])),
            by = list(participant, day)]
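
If you would rather avoid extra packages, the same grouping idea can also be sketched in base R with ave(). This is not part of the original answer, only one possible alternative. It assumes each participant/day group has at least one row with question == 0; the column name `number` follows the question and the helper `num` is made up for illustration.

```r
# Rebuild the example data (same values as in the question).
df <- data.frame(
  participant = rep(c("John", "Mary"), each = 6),
  day         = rep(rep(c(1, 2), each = 3), times = 2),
  question    = rep(c(1, 1, 0), times = 4),
  likes       = c("apples", "apples", "18", "apples", "apples", "7",
                  "bananas", "bananas", "24", "bananas", "bananas", "3"),
  stringsAsFactors = FALSE
)

# Convert likes to numeric; fruit names become NA (warning suppressed).
num <- suppressWarnings(as.numeric(as.character(df$likes)))

# Keep numbers only from the rows where question == 0.
num[df$question != 0] <- NA

# Within each participant/day group, spread the first non-NA number
# to every row of the group.
df$number <- ave(num, df$participant, df$day,
                 FUN = function(x) x[!is.na(x)][1])
```

ave() returns a vector as long as its input, so the result can be assigned straight to a new column without a merge step.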

Note

The present data set is a new one. Jota's answer works for the deleted data set.
