Find multiple matches of this and that nucleotide sequence


Problem description

I want to find every occurrence of ATG...TAG or ATG...TAA. I have tried the following:

#!/usr/bin/perl
use warnings;
use strict; 

my $file = ('ATGCCCCCCCCCCCCCTAGATGAAAAAAAAAATAAATGAAAAATAGATGCCCCCCCCCCCCCCC');

while($file =~ /(?=(ATG\w+?TAG|ATG\w+?TAA))/g){
    print "$1\n";           
} 

This gives:

ATGCCCCCCCCCCCCCTAG
ATGAAAAAAAAAATAAATGAAAAATAG
ATGAAAAATAG

I want:

ATGCCCCCCCCCCCCCTAG
ATGAAAAAAAAAATAA
ATGAAAAATAG

What am I doing wrong?

Recommended answer

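You are actually very close. Your pattern has two alternatives where I think you really only want a single one, and I don't think you need the lookahead either. The alternation tries ATG\w+?TAG first at each position, so the lazy \w+? runs past an earlier TAA until it reaches a TAG. Putting both stop codons in one character class, TA[AG], stops at whichever comes first.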

#!/usr/bin/perl
use warnings;
use strict;

my $file = ('ATGCCCCCCCCCCCCCTAGATGAAAAAAAAAATAAATGAAAAATAGATGCCCCCCCCCCCCCCC');

while($file =~ /(ATG\w+?TA[AG])/g){
    print "$1\n";
}

# output
# ATGCCCCCCCCCCCCCTAG
# ATGAAAAAAAAAATAA
# ATGAAAAATAG

Piece by piece:

ATG matches a literal ATG

\w+? lazily matches one or more word characters (as few as possible)

TA[AG] matches a literal TAA or TAG
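
To make the difference concrete, here is a minimal sketch that runs the question's pattern and the answer's pattern against the same sequence. The helper name find_matches is hypothetical, introduced only for this illustration, and the question's pattern is assumed to be a single capture inside a lookahead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same test sequence as in the question.
my $seq = 'ATGCCCCCCCCCCCCCTAGATGAAAAAAAAAATAAATGAAAAATAGATGCCCCCCCCCCCCCCC';

# find_matches is a hypothetical helper (not from the original post):
# it returns the text captured by group 1 for every match of $re in $dna.
sub find_matches {
    my ($dna, $re) = @_;
    my @found;
    while ($dna =~ /$re/g) {
        push @found, $1;
    }
    return @found;
}

# Question's pattern: the TAG alternative is tried first, so the lazy \w+?
# keeps growing past a TAA until it finds a TAG.
my @alternation = find_matches($seq, qr/(?=(ATG\w+?TAG|ATG\w+?TAA))/);

# Answer's pattern: one lazy run that stops at the first TAA or TAG.
my @char_class = find_matches($seq, qr/(ATG\w+?TA[AG])/);

print "alternation: $_\n" for @alternation;
print "char class:  $_\n" for @char_class;
```

The alternation lines reproduce the output shown in the question, including the overlong second match, while the character-class lines stop at the first stop codon. The lookahead only matters when matches may overlap, which this input does not require.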
