How to extract the substring between two markers?
Problem description
Let's say I have a string 'gfgfdAAA1234ZZZuijjk' and I want to extract just the '1234' part.
I only know the few characters directly before (AAA) and directly after (ZZZ) the part I am interested in (1234).
With sed it is possible to do something like this with a string:
echo "$STRING" | sed -e "s|.*AAA\(.*\)ZZZ.*|\1|"
This gives me 1234 as a result.
How to do the same thing in Python?
Solution
Use regular expressions (see the Python re module documentation for further reference):
import re

text = 'gfgfdAAA1234ZZZuijjk'

m = re.search('AAA(.+?)ZZZ', text)
if m:
    found = m.group(1)

# found: 1234
or:
import re

text = 'gfgfdAAA1234ZZZuijjk'

try:
    found = re.search('AAA(.+?)ZZZ', text).group(1)
except AttributeError:
    # AAA, ZZZ not found in the original string
    found = ''  # apply your error handling

# found: 1234
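
If the markers are not fixed, the same idea can be wrapped in a small helper. The following is only a sketch: the name extract_between and its parameters are made up for illustration and are not part of the re module. It assumes the markers should be matched literally, so it passes them through re.escape, and it returns None when no match is found.

```python
import re

def extract_between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    # re.escape makes markers containing regex metacharacters match literally.
    pattern = re.escape(start) + r"(.+?)" + re.escape(end)
    m = re.search(pattern, text)
    return m.group(1) if m else None

print(extract_between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(extract_between('no markers here', 'AAA', 'ZZZ'))       # None

# Lazy (.+?) stops at the first ZZZ; greedy (.+) runs to the last one.
sample = 'AAA1ZZZ AAA2ZZZ'
print(re.search('AAA(.+?)ZZZ', sample).group(1))  # 1
print(re.search('AAA(.+)ZZZ', sample).group(1))   # 1ZZZ AAA2
```

Note that (.+?) needs at least one character between the markers and, by default, does not match across newlines. The sed command in the question uses greedy matching, so the two approaches can give different results when a marker appears more than once.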