How can I mark (flag) the first unique record as 1 and the rest of the similar records as 0 in a data frame in R (continuation)

Views: 67 Published: 2021/6/4 20:49:47 Tags: r dplyr duplicates unique mutate

Question

I need help with my data in R and dplyr. My first problem was solved here: "How can I mark (flag) first unique record as 1 and the rest similar records as 0 in data frame in R", but I need to improve on that result. I use the code below:

df %>% mutate(drive = +!duplicated(paste(date, adress)))

The result is as follows:

 jobs, date, adress, drive 
1 111 28.03    bla     1 
2 111 28.03    bla     0 
3 111 28.03    bla     0 
4 111 28.03    bla     0 
5 111 28.03    bla     0 
6 111 28.03    bla     0 
7 111 28.03    bla     0 
8 111 28.03    bla     0 
9 111 28.03    bla     0 <- 9th record of the same job
10 111 28.03    bla     0 <- 10th record of the same job
11 345 05.03    bla     1 
12 111 28.03    bla     0  
13 236 28.03    abc     1
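
As a side note on why the one-liner above flags only the first row of each date/adress pair, here is a minimal base R sketch. The `key` vector is a made-up stand-in for `paste(date, adress)` and is not taken from the real data.

```r
# Hypothetical toy keys standing in for paste(date, adress)
key <- c("28.03 bla", "28.03 bla", "05.03 bla", "28.03 bla", "28.03 abc")

# duplicated() is TRUE for every value that already appeared earlier
dup <- duplicated(key)

# !dup is TRUE only at first occurrences; unary + turns the logical into 0/1
drive <- +!dup

print(drive)
# [1] 1 0 1 0 1
```

Because duplicated() only distinguishes "seen before" from "not seen before", it has no notion of how many times a key has occurred, which is why the 10th repeat still gets 0.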

I need to improve my dplyr code a bit so that my data looks like this:

 jobs, date, adress, drive 
1 111 28.03    bla     1 
2 111 28.03    bla     0 
3 111 28.03    bla     0 
4 111 28.03    bla     0 
5 111 28.03    bla     0 
6 111 28.03    bla     0 
7 111 28.03    bla     0 
8 111 28.03    bla     0 
9 111 28.03    bla     0 <- 9th record of the same job
10 111 28.03    bla     1 <- 10th record: it should be 1, not 0. Once the count of "the same jobs" goes above 9, the flag is 1 again.
11 345 05.03    bla     1 <- new record of the job, so 1
12 111 28.03    bla     0
13 236 28.03    abc     1

So the first record gives me 1, records 2-9 of the same job give me 0, the 10th record of the same job gives me 1 again, records 11-19 give me 0, and so on.

Recommended answer

When there's more than one condition to test, I like to use case_when instead of nested if_else calls. It works by running each test in order and outputting the part after the ~ for the first test that is TRUE. My last test here is just TRUE, so anything not caught by the first two tests produces a 0.

library(dplyr)

df %>%
  group_by(date, adress) %>%   # do these two vars define each "job"? 
  mutate(drive = case_when(
    row_number() == 1 ~ 1,        # first record in the group
    row_number() %% 10 == 0 ~ 1,  # every 10th record in the group
    TRUE ~ 0)) %>%                # everything else
  ungroup()
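
To check the case_when version against the sample in the question, here is a sketch that rebuilds the 13 rows by hand and runs the same pipeline. The `df` construction is my own transcription of the question's table, and it assumes `date` is stored as text; the real data may differ.

```r
library(dplyr)

# Sample data transcribed from the question's table (13 rows);
# storing date as character is an assumption
df <- data.frame(
  jobs   = c(rep(111, 10), 345, 111, 236),
  date   = c(rep("28.03", 10), "05.03", "28.03", "28.03"),
  adress = c(rep("bla", 12), "abc")
)

result <- df %>%
  group_by(date, adress) %>%
  mutate(drive = case_when(
    row_number() == 1 ~ 1,        # first record in the group
    row_number() %% 10 == 0 ~ 1,  # 10th, 20th, ... record in the group
    TRUE ~ 0)) %>%
  ungroup()

print(result$drive)
# [1] 1 0 0 0 0 0 0 0 0 1 1 0 1
```

Row 10 now gets 1 because it is the 10th row of the (28.03, bla) group, and row 12 gets 0 because it is the 11th row of that same group.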

Since there are only two output values, the case_when call could alternatively be written with if_else:

df %>%
  group_by(date, adress) %>%   # do these two vars define each "job"? 
  mutate(drive = if_else(row_number() == 1 | row_number() %% 10 == 0, 1, 0)) %>%
  ungroup()
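
If dplyr is not available, roughly the same result can be sketched in base R with ave(). This is a swapped-in alternative and not part of the original answer. The `df` below is my own transcription of the question's table (with `date` assumed to be text), and `idx` is a helper name introduced only for illustration.

```r
# Sample data transcribed from the question's table (13 rows);
# storing date as character is an assumption
df <- data.frame(
  jobs   = c(rep(111, 10), 345, 111, 236),
  date   = c(rep("28.03", 10), "05.03", "28.03", "28.03"),
  adress = c(rep("bla", 12), "abc")
)

# ave() applies seq_along() within each (date, adress) group,
# giving a per-group row counter in the original row order
idx <- ave(seq_len(nrow(df)), df$date, df$adress, FUN = seq_along)

# Flag the first record and every 10th record of each group
df$drive <- as.integer(idx == 1 | idx %% 10 == 0)

print(df$drive)
# [1] 1 0 0 0 0 0 0 0 0 1 1 0 1
```

The per-group counter plays the role of row_number() after group_by(), so the flagging condition itself is unchanged.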
