How to extract a couple of marked strings from a line (Python)
Question
My friends,
I spent quite some time on this one... but cannot yet figure out a better way to do it. I am coding in python, by the way.
So, here is a line of text in a file I am working with, for example:
">ref|ZP_01631227.1| 3-dehydroquinate synthase [Nodularia spumigena CCY9414]..."
How can I extract the two strings "ZP_01631227.1" and "Nodularia spumigena CCY9414" from the line?
The pairs of "| |" and brackets are like markers so we know we want to get the strings in between the two...
I guess I can probably loop over all the characters in the line and do it the hard way. It just takes so much time... Wondering if there is a python library or other smart ways to do it nicely?
Thanks, everyone!
Recommended answer
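>>> for line in open("file"):
...     if "|" in line:
...         # the ID is the text between the first and second "|"
...         whatiwant_1 = line.split("|")[1]
...     if "[" in line:
...         # the organism is the text between the first "[" and the next "]"
...         whatiwant_2 = line.split("[")[1].split("]")[0]
...
>>> print(whatiwant_1, whatiwant_2)
ZP_01631227.1 Nodularia spumigena CCY9414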
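Since the question also asks about a library, here is a minimal sketch of the same extraction using the standard `re` module instead of `split`. It is an alternative to the answer above, not part of it. The names `HEADER_RE` and `extract_marked` are made up for illustration, and the pattern assumes every line follows the ">ref|ID| description [organism]" layout shown in the question.

```python
import re

# Assumed layout: ">ref|ID| description [organism]...".
# Group 1 is the text between the first pair of "|" characters,
# group 2 is the text inside the first pair of square brackets after it.
HEADER_RE = re.compile(r"\|([^|]+)\|[^\[]*\[([^\]]+)\]")


def extract_marked(line):
    """Return (id, organism), or None when the line lacks either marker pair."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


line = ">ref|ZP_01631227.1| 3-dehydroquinate synthase [Nodularia spumigena CCY9414]..."
print(extract_marked(line))
```

One difference from the `split` version: a line with only one of the two markers gives `None` here, so the two values always come from the same line.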