How to delete everything after the nth delimiter in R?


Problem description

I have a vector, myvec, and want to remove everything from the second ':' onward to get the result shown below. More generally, how do I remove everything after the nth ':'?

myvec <- c("chr2:213403244:213403244:G:T:snp", "chr7:55240586:55240586:T:G:snp", "chr7:55241607:55241607:C:G:snp")

result
chr2:213403244   
chr7:55240586
chr7:55241607

Recommended answer

We can use sub. The pattern anchors at the start of the string (^), matches one or more characters that are not ':' ([^:]+), then a ':', then one or more further characters that are not ':' ([^:]+). All of that sits inside a capture group, i.e. within the parentheses, and the trailing .* matches the rest of the string. The replacement is the capture group alone (\\1), so everything after it is dropped.

sub('^([^:]+:[^:]+).*', '\\1', myvec)
#[1] "chr2:213403244" "chr7:55240586"  "chr7:55241607" 

The above works for the example posted. For the general case of removing everything after the nth delimiter, build the pattern from n:

n <- 2
pat <- paste0('^([^:]+(?::[^:]+){',n-1,'}).*')
sub(pat, '\\1', myvec)
#[1] "chr2:213403244" "chr7:55240586"  "chr7:55241607" 

Checking with a different n. The pattern embeds n, so rebuild it before calling sub again:

n <- 3
pat <- paste0('^([^:]+(?::[^:]+){',n-1,'}).*')
sub(pat, '\\1', myvec)
#[1] "chr2:213403244:213403244" "chr7:55240586:55240586"  
#[3] "chr7:55241607:55241607"  
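
Because the pattern has to be rebuilt whenever n changes, the regex approach above can be wrapped in a small function. This is only a sketch: the name keep_n_fields and its delim argument are hypothetical (not part of the original answer), delim is assumed to be a single character with no special meaning in a regular expression, and perl = TRUE is added here as a conservative choice of regex engine.

```r
# Hypothetical helper (not from the original answer): keep the first n
# delimiter-separated fields of each string and drop the rest.
# Assumes `delim` is one character with no special regex meaning.
keep_n_fields <- function(x, n, delim = ":") {
  stopifnot(n >= 1, nchar(delim) == 1)
  field <- paste0("[^", delim, "]+")
  # first field, then (n - 1) repetitions of "delimiter + field"
  pat <- paste0("^(", field, "(?:", delim, field, "){", n - 1, "}).*")
  sub(pat, "\\1", x, perl = TRUE)
}

myvec <- c("chr2:213403244:213403244:G:T:snp",
           "chr7:55240586:55240586:T:G:snp",
           "chr7:55241607:55241607:C:G:snp")
keep_n_fields(myvec, 2)
keep_n_fields(myvec, 3)
```

A string with fewer than n fields does not match the pattern, so sub returns it unchanged. The same happens when one of the first n fields is empty (for example "a::b"), because each field must contain at least one character.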

Another option is to split each string on ':' and then paste the first n components back together.

n <- 2
vapply(strsplit(myvec, ':'), function(x)
            paste(x[seq.int(n)], collapse=':'), character(1L))
#[1] "chr2:213403244" "chr7:55240586"  "chr7:55241607" 
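
One caveat with the split-and-paste approach: x[seq.int(n)] pads with NA when a string has fewer than n components, so the pasted result ends in ":NA". A minimal sketch of a variant that avoids this by using head(); the function name first_n_fields and its delim argument are hypothetical, and fixed = TRUE is an added assumption so that the delimiter is treated as a literal string rather than a regex.

```r
# Hypothetical helper (not from the original answer): split on a literal
# delimiter and rejoin at most the first n fields.
first_n_fields <- function(x, n, delim = ":") {
  parts <- strsplit(x, delim, fixed = TRUE)
  vapply(parts,
         # head() returns fewer than n elements when the string is short,
         # instead of padding with NA
         function(p) paste(head(p, n), collapse = delim),
         character(1L))
}

first_n_fields(c("chr2:213403244:213403244:G:T:snp", "chr7:55240586"), 3)
```

Here the second string has only two fields and is returned as is.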
