Compare two column values and extract added characters to a new diff column


Problem description

I am comparing two columns and want to extract only the characters that were added to the previous column's values. In other words, for each row I only want the difference: the characters in v2 that were added relative to the value in v1. The data below shows what the expected output in the diff column should look like.

dput(df)
structure(list(v1 = c("John|Alice,Mark|mercy, Austin|Silva", "Eunice|stoney, Brandon|Mary", "Apple| -Mango"),
               v2 = c("John|Alice,Mark|mercy, Austin|Silva|James |Jacy",  "NA ", "Apple| +Mango | Orange"),
               diff = c("|James |Jacy","NA", "+ |Orange")),
              class = "data.frame", row.names = c(NA,  -3L))

I tried this code, but it returns the whole values from the first and second columns, whereas I want only the characters newly added relative to the previous column:

library(dplyr); library(stringr)
dff <- df %>% mutate(diff = str_remove(v1,v2))


Recommended answer

You just need to specify the correct delimiters to split on, then take the set difference of the resulting pieces:

 Map(function(x, y) paste(setdiff(y, x), collapse = '| '), strsplit(df$v1, '\\||, | | -| \\+'), strsplit(df$v2, '\\||, | | -| \\+'))

#[[1]]
#[1] "James| | Jacy"

#[[2]]
#[1] "NA"

#[[3]]
#[1] "Orange"
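
The stray "| |" in the first result comes from an empty piece produced by the split. A small sketch of the intermediate steps for the first row, using the same delimiter pattern as above; the variable names delims, old_parts and new_parts are chosen here only for illustration:

```r
# Delimiter pattern copied from the answer above.
delims <- "\\||, | | -| \\+"

# First row of the sample data: v1 is the old value, v2 the new one.
old_parts <- strsplit("John|Alice,Mark|mercy, Austin|Silva", delims)[[1]]
new_parts <- strsplit("John|Alice,Mark|mercy, Austin|Silva|James |Jacy", delims)[[1]]

# Pieces that occur in the new value but not in the old one.
# In "James |Jacy" the space and the "|" are adjacent delimiters,
# so an empty string appears between "James" and "Jacy".
added <- setdiff(new_parts, old_parts)
added
# returns c("James", "", "Jacy")
```

Pasting these three pieces together with '| ' as the separator is what gives "James| | Jacy".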

To assign the result back to the data frame, it is better to use mapply, which returns a character vector rather than a list, and simply assign it, i.e.

df$diff1 <- mapply(function(x, y) paste(setdiff(y, x), collapse = '| '), strsplit(df$v1, '\\||, | | -| \\+'), strsplit(df$v2, '\\||, | | -| \\+'))
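
If the empty pieces are not wanted in the result, one possible variation is to wrap the split-and-compare step in a small function and filter them out before pasting. This is only a sketch: added_tokens is a hypothetical helper name, not part of any package, and dropping empty pieces and joining with "|" are assumptions about the desired format rather than something stated in the question.

```r
# Sample data from the question (the hand-written expected diff column is left out).
df <- data.frame(
  v1 = c("John|Alice,Mark|mercy, Austin|Silva",
         "Eunice|stoney, Brandon|Mary",
         "Apple| -Mango"),
  v2 = c("John|Alice,Mark|mercy, Austin|Silva|James |Jacy",
         "NA ",
         "Apple| +Mango | Orange"),
  stringsAsFactors = FALSE
)

# Delimiter pattern copied from the answer.
delims <- "\\||, | | -| \\+"

# Hypothetical helper: pieces of `new` that do not occur in `old`.
added_tokens <- function(old, new, pattern = delims) {
  old_parts <- strsplit(old, pattern)[[1]]
  new_parts <- strsplit(new, pattern)[[1]]
  added <- setdiff(new_parts, old_parts)
  added <- added[nzchar(added)]  # assumption: empty pieces are not wanted
  paste(added, collapse = "|")
}

# mapply applies the helper row by row; USE.NAMES = FALSE keeps the result unnamed.
df$diff1 <- mapply(added_tokens, df$v1, df$v2, USE.NAMES = FALSE)
```

With the sample data, df$diff1 should come out as "James|Jacy", "NA" and "Orange". Note that "NA" here is the literal string from v2, not a missing value.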
