grep (bash) multi-line pattern

Views: 19 Published: 2022/1/6 14:05:04 Tags: bash unix grep multiline


Problem description


In bash (4.3.46(1)) I have some multi-line, so-called FASTA records. Each record starts with a line of the form >name, followed by lines of DNA sequence ([AGCTNacgtn]). Here are three records:

>chr1
AGCTACTTTT
AGGGNGGTNN
>chr2
TTGNACACCC
TGGGGGAGTA
>chr3
TGACGTGGGT
TCGGGTTTTT

How do I use bash grep to get the second record? In other languages one might use:

>chr2\n([AGCTNagctn]*\n)*

In Bash I tried to use the ideas from here (among other SO posts). This did not work:

grep -zo '>chr2[AGCTNacgtn]+' file

Result should be:

>chr2
TTGNACACCC
TGGGGGAGTA
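
A minimal sketch of why the attempt above finds nothing, assuming GNU grep. Without -E or -P the pattern is a basic regular expression, where + is a literal plus sign, and the bracket expression contains no newline, so the match cannot continue past the line break after >chr2. The temporary file and the variable names are illustrative only.

```shell
# Recreate the three sample records in a temporary file (path is arbitrary).
fasta=$(mktemp)
printf '%s\n' '>chr1' AGCTACTTTT AGGGNGGTNN \
              '>chr2' TTGNACACCC TGGGGGAGTA \
              '>chr3' TGACGTGGGT TCGGGTTTTT > "$fasta"

# Original attempt: basic regex, so '+' is a literal plus sign, and the
# character after '>chr2' is a newline that the bracket expression excludes.
if grep -zo '>chr2[AGCTNacgtn]+' "$fasta" > /dev/null; then
    basic_status=matched
else
    basic_status=no_match
fi
echo "basic regex attempt: $basic_status"

# Same pattern as an extended regex: '+' now means "one or more", but the
# newline after '>chr2' still blocks the match.
if grep -Ezo '>chr2[AGCTNacgtn]+' "$fasta" > /dev/null; then
    extended_status=matched
else
    extended_status=no_match
fi
echo "extended regex attempt: $extended_status"

rm -f "$fasta"
```

Both variants fail for the same underlying reason: the pattern has to be able to match the newline characters between the header and the sequence lines.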

SOLUTION

On my system this was the solution (almost Cyrus' answer below, i.e. without the pipe to a second grep .):

grep -Pzo '>chr2\n[AGCTNacgtn\n]+' file
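
One thing to be aware of when dropping the second grep: with -z, GNU grep terminates each output record with a NUL byte instead of a newline. The sketch below, which assumes GNU grep built with PCRE support, counts the NUL bytes in the output; the temporary file and the nul_count variable are illustrative names only.

```shell
# Recreate the three sample records in a temporary file (path is arbitrary).
fasta=$(mktemp)
printf '%s\n' '>chr1' AGCTACTTTT AGGGNGGTNN \
              '>chr2' TTGNACACCC TGGGGGAGTA \
              '>chr3' TGACGTGGGT TCGGGTTTTT > "$fasta"

# tr -cd '\0' deletes every byte except NUL, so wc -c counts the NUL bytes
# that grep -z appended to its output.
nul_count=$(grep -Pzo '>chr2\n[AGCTNacgtn\n]+' "$fasta" | tr -cd '\0' | wc -c)
nul_count=$((nul_count))   # strip any padding some wc implementations add
echo "NUL bytes in output: $nul_count"

rm -f "$fasta"
```

The trailing NUL is usually invisible on a terminal, which may be why the command looks complete without the pipe. It can matter when the output is redirected to a file or fed to another tool.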

Solution

With GNU grep:

grep -Pzo '>chr2\n[AGCTNacgtn\n]+' file | grep .

Output:

>chr2
TTGNACACCC
TGGGGGAGTA
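
For reference, here is a self-contained sketch that reproduces the result, assuming GNU grep built with PCRE support. -P enables Perl-compatible regular expressions so that \n is understood, -z makes grep treat the input as NUL-separated so the whole file becomes a single record, and -o prints only the matched part. The sketch swaps the second grep for tr -d '\0', which removes the trailing NUL explicitly; that substitution, the temporary file, and the record variable are choices made for this example, not part of the answer above.

```shell
# Recreate the three sample records in a temporary file (path is arbitrary).
fasta=$(mktemp)
printf '%s\n' '>chr1' AGCTACTTTT AGGGNGGTNN \
              '>chr2' TTGNACACCC TGGGGGAGTA \
              '>chr3' TGACGTGGGT TCGGGTTTTT > "$fasta"

# Match the header line, then greedily consume sequence characters and
# newlines. The match stops at the '>' of the next header because '>' is
# not in the bracket expression. tr removes the NUL terminator from -z.
record=$(grep -Pzo '>chr2\n[AGCTNacgtn\n]+' "$fasta" | tr -d '\0')
printf '%s\n' "$record"

rm -f "$fasta"
```

Replacing chr2 with another header name selects a different record, as long as the name is not a prefix of another header in the file.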
