Remove first occurrence of elements in a vector from another vector


Problem description

I have a character vector that includes some duplicated elements, e.g.

v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")

And another vector that contains each of those strings only once, e.g.

x <- c("d10", "d11", "d13")

I want to remove only the first occurrence of each element of x from the first vector, v. In this example, d13 occurs once in x and twice in v, so only its first match should be removed from v and the later duplicate kept. Thus, I want to end up with:

"d09", "d01", "d02", "d13"

I've been trying various things, e.g. z <- v[!(v %in% x)], but that removes every instance of the values in x, not just the first, so I end up with this instead:

"d09", "d01", "d02"

What can I do to only remove one instance of a duplicated element?

Solution

You can use match and negative indexing.

v[-match(x, v)]

produces

[1] "d09" "d01" "d02" "d13"

match only returns the location of the first match of a value, which we use to our advantage here.
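One caveat worth sketching: match returns NA for any value of x that is absent from v, and an empty x gives a zero-length index, for which v[-integer(0)] returns an empty vector instead of v. The helper below is a minimal sketch that guards against both cases; the name remove_first is made up for this illustration and is not part of base R.

```r
# Hypothetical helper (not part of base R): drop the first occurrence in v
# of each value in x, ignoring values of x that do not occur in v.
remove_first <- function(v, x) {
  idx <- match(x, v)            # position of the first match, NA if absent
  idx <- idx[!is.na(idx)]       # discard values of x that are not in v
  if (length(idx) == 0L) {
    return(v)                   # nothing to remove; avoid v[-integer(0)]
  }
  v[-idx]
}

v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
x <- c("d10", "d11", "d13")

remove_first(v, x)
# [1] "d09" "d01" "d02" "d13"
```

If x itself contains a value more than once, match gives the same position for each copy, so this sketch still removes only one occurrence of that value from v.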

Note that %in% and is.element are degenerate versions of match. Compare:

match(x, v)            # [1] 6 2 3
match(x, v) > 0        # [1] TRUE TRUE TRUE
x %in% v               # [1] TRUE TRUE TRUE
is.element(x, v)       # [1] TRUE TRUE TRUE

The last three give the same result here and are essentially the first one coerced to logical; the source of %in% and is.element is essentially match(x, table, nomatch = 0L) > 0L. That coercion throws away the key information, the position of the first match of each x value in v, and leaves you knowing only that the x values exist in v.
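The comparison above agrees for every element because each value of x occurs in v. As a small illustration of where the plain match(x, v) > 0 form departs from %in%, the sketch below uses a made-up vector y containing "zzz", a value assumed for this example that is absent from v.

```r
v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
y <- c("d10", "zzz")                 # "zzz" is a made-up value not in v

match(y, v)                          # [1]  6 NA
match(y, v) > 0                      # [1] TRUE   NA
match(y, v, nomatch = 0L) > 0L       # [1]  TRUE FALSE
y %in% v                             # [1]  TRUE FALSE
```

Setting nomatch = 0L is what turns a missing value into FALSE rather than NA.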

The converse, v %in% x, asks a different question from the one you want answered: "which values in v are in x?" Every duplicate in v satisfies that condition, so all of them are removed rather than just the first.
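To make the difference concrete, the following sketch prints the positions each approach selects for the example vectors from the question. The variable names pos_in and pos_match are introduced here only for illustration.

```r
v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
x <- c("d10", "d11", "d13")

v %in% x                        # one logical value per element of v
# [1] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE

pos_in    <- which(v %in% x)    # every position in v holding a value from x
pos_match <- match(x, v)        # only the first position for each value of x

pos_in                          # [1] 2 3 6 7
pos_match                       # [1] 6 2 3
setdiff(pos_in, pos_match)      # [1] 7
```

Position 7 is the second "d13": v %in% x flags it for removal, while match leaves it alone.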
