Generate a self-reference key within a table using R mutate in a data frame
Problem description
I have an input table with 3 columns: Person_Id, Visit_Id (a unique ID for each visit of each person) and Purpose, as shown below. I would like to generate a new column, "Preceding_visit_id", that holds the ID of the person's immediately preceding visit. For example, if a person visited the hospital with Visit_Id = 2, the preceding visit ID is 1; if Visit_Id = 5, the preceding visit ID is 4. Is there an elegant way to do this using the mutate function?
Input table
Output table
As you can see, the 'Preceding_visit_id' column refers to the person's previous visit, as identified by the Visit_Id column.
Please note that this is a transformation of just one column in a large program, so anything elegant would be helpful.
The dput() output is here:
structure(list(Person_Id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
3, 3, 3), Visit_Id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14), Purpose = c("checkup", "checkup", "checkup", "checkup",
"checkup", "checkup", "checkup", "checkup", "checkup", "checkup",
"checkup", "checkup", "checkup", "checkup"), Preceding_visit_id = c(NA,
1, 2, 3, 4, NA, 6, 7, 8, 9, 10, NA, 12, 12)), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -14L), spec =
structure(list(
cols = list(Person_Id = structure(list(), class = c("collector_double",
"collector")), Visit_Id = structure(list(), class = c("collector_double",
"collector")), Purpose = structure(list(), class =
c("collector_character",
"collector")), Preceding_visit_id = structure(list(), class =
c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
Recommended answer
The Person_Id fields in your examples don't match.
I'm not sure if this is what you're after, but starting from your dput() output I created a data frame with the last column removed:
library(dplyr)
# df_output is assumed to hold the structure from the dput() output above
df_input <- df_output %>%
  select(-Preceding_visit_id)
Then do the following:
df_input %>%
  group_by(Person_Id) %>%
  mutate(Preceding_visit_id = lag(Visit_Id))
The output is:
# A tibble: 14 x 4
# Groups:   Person_Id [3]
   Person_Id Visit_Id Purpose Preceding_visit_id
       <dbl>    <dbl> <chr>                <dbl>
 1         1        1 checkup                 NA
 2         1        2 checkup                  1
 3         1        3 checkup                  2
 4         1        4 checkup                  3
 5         1        5 checkup                  4
 6         2        6 checkup                 NA
 7         2        7 checkup                  6
 8         2        8 checkup                  7
 9         2        9 checkup                  8
10         2       10 checkup                  9
11         2       11 checkup                 10
12         3       12 checkup                 NA
13         3       13 checkup                 12
14         3       14 checkup                 13
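One caveat: lag() works on row order, so the result above is only correct because the rows are already sorted by Visit_Id within each person. If that is not guaranteed in your real data, a minimal sketch along these lines may help. The `visits` tibble is made-up, deliberately unsorted data for illustration; `order_by` is an existing argument of dplyr's lag().

```r
library(dplyr)

# Hypothetical, deliberately unsorted visits for two people
visits <- tibble(
  Person_Id = c(1, 2, 1, 2, 1),
  Visit_Id  = c(3, 7, 1, 6, 2)
)

result <- visits %>%
  group_by(Person_Id) %>%
  # order_by makes lag() follow Visit_Id order instead of row order
  mutate(Preceding_visit_id = lag(Visit_Id, order_by = Visit_Id)) %>%
  ungroup()  # drop the grouping so later steps are not affected

result
```

Sorting first with arrange(Person_Id, Visit_Id) and then using a plain lag(Visit_Id) should give the same values, at the cost of reordering the rows.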