Printing a sequence from a fasta file

Views: 23 | Published: 2022/1/6 14:03:29 | Tags: bash grep fasta

Problem description

I often need to find a particular sequence in a fasta file and print it. For those who don't know, fasta is a text file format for biological sequences (DNA, proteins, etc.). It's pretty simple: each sequence starts with a line holding the sequence name preceded by a '>', and all the lines that follow, up to the next '>', are the sequence itself. For example:

>sequence1
ACTGACTGACTGACTG
>sequence2
ACTGACTGACTGACTG
ACTGACTGACTGACTG
>sequence3
ACTGACTGACTGACTG

The way I currently get the sequence I need is to use grep with -A (which prints the given number of lines after each match), so I'll run

grep -A 10 sequence_name filename.fa

and then if I don't see the start of the next sequence in the file, I'll change the 10 to 20 and repeat until I'm sure I'm getting the whole sequence.

It seems like there should be a better way to do this. For example, can I ask it to print up until the next '>' character?
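
One possible way to get the "print up until the next '>'" behaviour is to switch from grep to awk and toggle a flag on every header line. The following is only a minimal sketch under a few assumptions: the function name print_fasta_record is hypothetical, the sequence name is taken to be the first whitespace-delimited word of the header, and the match is exact (so sequence1 does not also select sequence10).

```shell
# Hypothetical helper: print one FASTA record (header plus its sequence lines).
# Usage: print_fasta_record NAME FILE
print_fasta_record() {
    awk -v name="$1" '
        # On every header line, decide whether this record is the wanted one.
        # $1 is the first word of the header, e.g. ">sequence2".
        /^>/ { keep = ($1 == (">" name)) }
        # Print every line (header included) while inside the wanted record.
        keep
    ' "$2"
}
```

With the example file above, print_fasta_record sequence2 filename.fa would print the >sequence2 header and its two sequence lines, stopping before >sequence3, with no need to guess a line count for -A.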
