How to add double quotes in a specific column


Question

How do I add double quotes around the gene_id value?

My original file:

##gtf-version 3
Bany_Scaf1  maker   gene    201136  207903  .   +   .   Alias "maker-Bany_Scaf1-snap-gene-2.23"; Dbxref "InterPro:IPR019774" "Pfam:PF00351"; ID Bany_03723; Name Bany_03723; Ontology_term "GO:0016714" "GO:0055114"; gene_id Bany_03723
Bany_Scaf1  maker   transcript  201136  207903  .   +   .   Alias "maker-Bany_Scaf1-snap-gene-2.23-mRNA-1"; Dbxref "InterPro:IPR019774" "Pfam:PF00351"; ID "Bany_03723-RA"; Name "Bany_03723-RA"; Ontology_term "GO:0016714" "GO:0055114"; Parent Bany_03723; _AED "0.06"; _QI "45|1|1|1|1|1|7|425|530"; _eAED "0.06"; gene_id Bany_03723; original_biotype mrna; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    201136  201304  .   +   .   ID "Bany_03723-RA:1"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    202687  202770  .   +   .   ID "Bany_03723-RA:2"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    202886  202921  .   +   .   ID "Bany_03723-RA:3"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    203004  203820  .   +   .   ID "Bany_03723-RA:4"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    206097  206223  .   +   .   ID "Bany_03723-RA:5"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    206649  206878  .   +   .   ID "Bany_03723-RA:6"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 
Bany_Scaf1  maker   exon    207304  207903  .   +   .   ID "Bany_03723-RA:7"; Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA" 

I want to change every gene_id Bany_xxxxxx to gene_id "Bany_xxxxxx".

I tried:

sed -E 's#(Parent|gene_id|ID) ([0-9A-Za-z.]+)#\1 \"\2\"#g'

But the double quotes were added in the wrong place, for example:

gene_id "Bany"_03723

What should I do?

Recommended answer

sed

$ sed -E 's/gene_id ([^;]+)/gene_id "\1"/' file

This matches the value that follows gene_id, up to the next ; (or the end of the line), and wraps it in double quotes. It assumes a single space between gene_id and the value; if the separator is a tab, use \t in the pattern instead.
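The attempt in the question stopped at Bany because its character class [0-9A-Za-z.] does not include the underscore, so the match ended before _03723. As a minimal sketch of a slightly more defensive variant (not part of the original answer), the command can be wrapped in a small helper. The name quote_gene_id is hypothetical. It assumes gene_id values contain no whitespace, semicolons, or double quotes, accepts either spaces or tabs after gene_id (and normalizes them to one space), and leaves values that are already quoted untouched, so it can be run twice safely.

```shell
#!/bin/sh
# quote_gene_id: hypothetical helper, not from the original answer.
# Reads GTF lines on stdin and wraps every unquoted gene_id value in double quotes.
quote_gene_id() {
    # gene_id[[:space:]]+   the attribute name followed by spaces or tabs
    # ([^";[:space:]]+)     the value: everything up to a quote, semicolon, or whitespace
    #                       (this class includes "_", unlike [0-9A-Za-z.])
    # A value that already starts with a double quote cannot match, so it is left as-is.
    sed -E 's/gene_id[[:space:]]+([^";[:space:]]+)/gene_id "\1"/g'
}

# Example on one attribute string taken from the exon lines above:
printf '%s\n' 'Parent "Bany_03723-RA"; gene_id Bany_03723; transcript_id "Bany_03723-RA"' | quote_gene_id
# prints: Parent "Bany_03723-RA"; gene_id "Bany_03723"; transcript_id "Bany_03723-RA"
```

To convert a whole file, something like quote_gene_id < file > file.quoted should work, where both file names are placeholders.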
