R - Group by variable and then assign a unique ID

Views: 84  Published: 2020/10/26 2:30:06  Tags: r, dplyr

Question

I am interested in de-identifying a sensitive data set with both time-fixed and time-variant values. I want to (a) group all cases by social security number, (b) assign those cases a unique ID and then (c) remove the social security number.

Example data set:

personal_id    gender  temperature
111-11-1111      M        99.6
999-999-999      F        98.2
111-11-1111      M        97.8
999-999-999      F        98.3
888-88-8888      F        99.0
111-11-1111      M        98.9

Any solutions would be very much appreciated.

Recommended answer

dplyr has a group_indices() function for creating unique group IDs:

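library(dplyr)

# Example data: repeated personal_id values mark rows that belong to the same person
data <- data.frame(personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
                   gender = c("M", "F", "M", "M"),
                   temperature = c(99.6, 98.2, 97.8, 95.5))

# Assign one integer ID per distinct personal_id
data$group_id <- data %>% group_indices(personal_id)
# Drop the identifying column
data <- data %>% select(-personal_id)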

data
  gender temperature group_id
1      M        99.6        1
2      F        98.2        3
3      M        97.8        2
4      M        95.5        1
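
The IDs in this output follow the sorted order of the distinct personal_id values, which is why the second row gets 3 and the third row gets 2. A minimal base R sketch of the same numbering, for cases where dplyr is not available; the names `ids` and `deidentified` are illustrative and not part of the original answer:

```r
# Same example data as in the answer above.
data <- data.frame(
  personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
  gender      = c("M", "F", "M", "M"),
  temperature = c(99.6, 98.2, 97.8, 95.5),
  stringsAsFactors = FALSE
)

# Number each row by the position of its personal_id among the sorted unique values.
ids <- match(data$personal_id, sort(unique(data$personal_id)))

# Attach the new ID and keep only the non-identifying columns.
deidentified <- data.frame(group_id = ids, data[, c("gender", "temperature")])
print(deidentified)
```

If the IDs should not reveal the ordering of the original identifiers, swapping `sort(unique(...))` for `sample(unique(...))` would give a random numbering instead; that is a design choice beyond what the original answer covers.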

Or within the same pipeline (see https://github.com/tidyverse/dplyr/issues/2160):

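data %>%
    # "." refers to the data frame being piped in
    mutate(group_id = group_indices(., personal_id)) %>%
    # Drop the identifying column, as the question asks
    select(-personal_id)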
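
As far as I know, dplyr 1.0.0 and later deprecate passing grouping variables directly to `group_indices()` and suggest grouping first and calling `cur_group_id()` instead. A sketch of the same pipeline in that style, assuming dplyr >= 1.0.0 is installed; `deidentified` is an illustrative name, not from the original answer:

```r
library(dplyr)

# Same example data as in the answer above.
data <- data.frame(
  personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
  gender      = c("M", "F", "M", "M"),
  temperature = c(99.6, 98.2, 97.8, 95.5)
)

deidentified <- data %>%
  group_by(personal_id) %>%            # one group per distinct personal_id
  mutate(group_id = cur_group_id()) %>% # integer ID of the current group
  ungroup() %>%                         # ungroup so the grouping column can be dropped
  select(-personal_id)                  # remove the identifying column

deidentified
```

The `ungroup()` step matters here: while the data is still grouped, `select()` keeps the grouping column, so personal_id would not actually be removed.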
