How to delete everything after the nth delimiter in R?
Question
I have this vector myvec. I want to remove everything after the second ':' and get the result shown below. How do I remove the string after the nth ':'?
myvec <- c("chr2:213403244:213403244:G:T:snp", "chr7:55240586:55240586:T:G:snp", "chr7:55241607:55241607:C:G:snp")
result
chr2:213403244
chr7:55240586
chr7:55241607
Recommended answer
We can use sub. From the start of the string, we match one or more characters that are not : (^([^:]+), followed by a :, followed by one or more characters that are not : ([^:]+), and place that whole match in a capture group, i.e. within the parentheses. The trailing .* matches the rest of the string, so replacing with the capture group (\\1) keeps only the first two fields.
sub('^([^:]+:[^:]+).*', '\\1', myvec)
#[1] "chr2:213403244" "chr7:55240586" "chr7:55241607"
The above works for the example posted. For general cases to remove after the nth delimiter,
n <- 2
pat <- paste0('^([^:]+(?::[^:]+){',n-1,'}).*')
sub(pat, '\\1', myvec)
#[1] "chr2:213403244" "chr7:55240586" "chr7:55241607"
Checking with a different 'n'
n <- 3
# repeat the same steps: rebuild the pattern with the new n
pat <- paste0('^([^:]+(?::[^:]+){',n-1,'}).*')
sub(pat, '\\1', myvec)
#[1] "chr2:213403244:213403244" "chr7:55240586:55240586"
#[3] "chr7:55241607:55241607"
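The pattern construction above can be wrapped in a small helper so that the pattern is always rebuilt for the current n. This is only a minimal sketch: the name keep_first_n is hypothetical (it is not part of the original answer), it assumes a single-character delimiter with no special regex meaning (such as ':' or ','), and it passes perl = TRUE, which is a choice made here rather than something the original code does.

```r
# Hypothetical helper: keep everything before the nth delimiter.
# Assumes `delim` is one character with no special regex meaning.
keep_first_n <- function(x, n, delim = ":") {
  stopifnot(n >= 1)
  field <- paste0("[^", delim, "]+")               # one field: a run of non-delimiters
  pat <- paste0("^(", field, "(?:", delim, field, "){", n - 1, "}).*")
  sub(pat, "\\1", x, perl = TRUE)                  # perl = TRUE is an assumption of this sketch
}

myvec <- c("chr2:213403244:213403244:G:T:snp",
           "chr7:55240586:55240586:T:G:snp",
           "chr7:55241607:55241607:C:G:snp")

two_fields   <- keep_first_n(myvec, 2)
three_fields <- keep_first_n(myvec, 3)

# A string with fewer than n fields does not match the pattern
# and is returned unchanged by sub()
unchanged <- keep_first_n("a:b", 3)
```

Because sub() leaves non-matching strings untouched, inputs with fewer than n fields pass through as they are.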
Another option is to split by : and then paste the first n components back together.
n <- 2
vapply(strsplit(myvec, ':'), function(x)
paste(x[seq.int(n)], collapse=':'), character(1L))
#[1] "chr2:213403244" "chr7:55240586" "chr7:55241607"
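One caveat with x[seq.int(n)]: if an element has fewer than n fields, the missing positions are NA, and paste turns them into the text "NA". A possible variant, sketched here with head(), keeps whatever fields exist. The vector short is made up for illustration, and fixed = TRUE is an optional addition that treats the delimiter as a literal string.

```r
n <- 3
# Made-up input: the first element has only two fields
short <- c("chr2:213403244", "chr7:55240586:55240586:T")

# Indexing past the end yields NA, which paste() converts to "NA"
with_na <- vapply(strsplit(short, ":", fixed = TRUE), function(x)
  paste(x[seq.int(n)], collapse = ":"), character(1L))
# the first element becomes "chr2:213403244:NA"

# head() returns at most n fields, so short elements are kept as they are
res <- vapply(strsplit(short, ":", fixed = TRUE), function(x)
  paste(head(x, n), collapse = ":"), character(1L))
```

Whether short inputs should be kept, dropped, or flagged depends on the data, so treat the head() variant as one option rather than the required behavior.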