How to extract the substring between two markers?

Problem description

Let's say I have a string 'gfgfdAAA1234ZZZuijjk' and I want to extract just the '1234' part.

I only know the characters that come directly before and after the part I am interested in: AAA immediately precedes 1234, and ZZZ immediately follows it.

With sed it is possible to do something like this with a string:

echo "$STRING" | sed -e "s|.*AAA\(.*\)ZZZ.*|\1|"

And this will give me 1234 as a result.

How can I do the same thing in Python?

Solution

Use regular expressions (see the documentation for Python's re module for further reference):

import re

text = 'gfgfdAAA1234ZZZuijjk'

# The non-greedy group (.+?) captures the shortest run of characters between AAA and ZZZ
m = re.search(r'AAA(.+?)ZZZ', text)
if m:
    found = m.group(1)

# found: 1234
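
One detail worth checking is how the non-greedy `(.+?)` behaves when the markers appear more than once, or when nothing sits between them. The sketch below is only an illustration: the sample strings are made up for this comparison and are not part of the original question.

```python
import re

# Hypothetical input in which both markers occur twice
text = 'xAAA12ZZZyAAA34ZZZz'

# Non-greedy: stops at the first ZZZ that follows the first AAA
lazy = re.search(r'AAA(.+?)ZZZ', text).group(1)

# Greedy: runs from the first AAA to the last ZZZ
greedy = re.search(r'AAA(.+)ZZZ', text).group(1)

# (.+?) needs at least one character, so adjacent markers do not match
empty_plus = re.search(r'AAA(.+?)ZZZ', 'AAAZZZ')

# (.*?) also accepts an empty string between the markers
empty_star = re.search(r'AAA(.*?)ZZZ', 'AAAZZZ')

print(lazy)                 # 12
print(greedy)               # 12ZZZyAAA34
print(empty_plus)           # None
print(empty_star.group(1))  # empty string
```

If an empty result between adjacent markers should count as a match, `(.*?)` is probably the better choice; otherwise `(.+?)` as used in the answer is fine.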

Alternatively, let a failed match raise an exception and catch it:

import re

text = 'gfgfdAAA1234ZZZuijjk'

try:
    found = re.search('AAA(.+?)ZZZ', text).group(1)
except AttributeError:
    # AAA, ZZZ not found in the original string
    found = '' # apply your error handling

# found: 1234
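
If the markers vary or may contain regex metacharacters, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function names `extract_between` and `extract_between_plain` and the `default` parameter are hypothetical. The second function is a swapped-in alternative that uses only `str.find` instead of regular expressions.

```python
import re


def extract_between(text, start, end, default=''):
    """Return the text between the first `start` and the next `end` (hypothetical helper)."""
    # re.escape makes the markers match literally, even if they contain . [ ( etc.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    # re.DOTALL lets the captured part span line breaks
    m = re.search(pattern, text, flags=re.DOTALL)
    return m.group(1) if m else default


def extract_between_plain(text, start, end, default=''):
    """Same result without regular expressions, using str.find (hypothetical helper)."""
    i = text.find(start)
    if i == -1:
        return default
    i += len(start)
    j = text.find(end, i)
    if j == -1:
        return default
    return text[i:j]


print(extract_between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))        # 1234
print(extract_between_plain('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(extract_between('a[b]c', '[', ']'))                           # b
```

Returning a default rather than raising keeps the calling code short; if a missing marker should be treated as an error, raising `ValueError` in the no-match branches would be a reasonable variation.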
