Compare two columns' values and extract added characters to a new diff column
Problem description
I am comparing two columns and want to extract only the characters that were added to the previous column's value. That is, for each row I want to compare v2 against v1 and put just the added characters into a new column. The diff column in the data below shows what the expected output should look like.
dput(df)
structure(list(v1 = c("John|Alice,Mark|mercy, Austin|Silva", "Eunice|stoney, Brandon|Mary", "Apple| -Mango"),
v2 = c("John|Alice,Mark|mercy, Austin|Silva|James |Jacy", "NA ", "Apple| +Mango | Orange"),
diff = c("|James |Jacy","NA", "+ |Orange")),
class = "data.frame", row.names = c(NA, -3L))
I tried the code below, but it gives me the whole values from column 1 and column 2, whereas I want only the characters newly added to the previous value.
library(dplyr); library(stringr)
dff <- df %>% mutate(diff = str_remove(v1,v2))
Recommended answer
You just need to specify the correct delimiters to split on:
Map(function(x, y) paste(setdiff(y, x), collapse = '| '), strsplit(df$v1, '\\||, | | -| \\+'), strsplit(df$v2, '\\||, | | -| \\+'))
#[[1]]
#[1] "James| | Jacy"
#[[2]]
#[1] "NA"
#[[3]]
#[1] "Orange"
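To make the mechanism easier to follow, here is a minimal sketch of the same split-then-setdiff idea on a single pair of strings. The inputs `old` and `new` and the simplified delimiter pattern are illustrative assumptions, not taken from the question's data.

```r
# Hypothetical inputs, chosen only to illustrate the idea
old <- "a|b, c"
new <- "a|b, c|d|e"

# Split on "|" or ", " (a simpler delimiter set than the one used above)
old_parts <- strsplit(old, "\\||, ")[[1]]
new_parts <- strsplit(new, "\\||, ")[[1]]

# Keep the tokens that occur in `new` but not in `old`
added <- setdiff(new_parts, old_parts)

# Glue the added tokens back into one string
result <- paste(added, collapse = "|")
print(result)
```

Note that setdiff treats its arguments as sets, so duplicated tokens are collapsed and a token that merely moved to a different position is not reported as added.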
To assign the result back to the data frame, it is better to use mapply, which here simplifies the result to a character vector instead of a list, and simply assign it, i.e.
df$diff1 <- mapply(function(x, y) paste(setdiff(y, x), collapse = '| '), strsplit(df$v1, '\\||, | | -| \\+'), strsplit(df$v2, '\\||, | | -| \\+'))
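If the pattern is needed in more than one place, the call can be wrapped in a small helper so the delimiter regex is written only once. The sketch below is one possible arrangement, not part of the original answer: the function name `added_tokens`, its arguments, the `toy` data frame, and the use of trimws to drop stray spaces are all assumptions made for illustration.

```r
# Hypothetical helper: name, arguments, and defaults are illustrative
added_tokens <- function(old, new, pattern = "\\||, ", sep = "|") {
  old_parts <- strsplit(old, pattern)
  new_parts <- strsplit(new, pattern)
  mapply(
    # Trim whitespace around each token, then keep tokens only found in `new`
    function(x, y) paste(setdiff(trimws(y), trimws(x)), collapse = sep),
    old_parts, new_parts,
    USE.NAMES = FALSE
  )
}

# Toy data, not the question's data frame
toy <- data.frame(
  v1 = c("a|b", "x, y"),
  v2 = c("a|b|c ", "x, y"),
  stringsAsFactors = FALSE
)
toy$diff <- added_tokens(toy$v1, toy$v2)
print(toy)
```

With this arrangement, a row where nothing was added gets an empty string in the diff column, because pasting a zero-length vector with collapse yields "".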