Remove a Substring Using Python


Question

I have already extracted some information from a forum. This is the raw string I have now:

string = 'i think mabe 124 + <font color="black"><font face="Times New Roman">but I don\'t have a big experience it just how I see it in my eyes <font color="green"><font face="Arial">fun stuff'

The parts I do not want are the substrings "<font color="black"><font face="Times New Roman">" and "<font color="green"><font face="Arial">". I want to keep the rest of the string. So the result should look like this:

resultString = "i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff"

How can I do this? I actually used Beautiful Soup to extract the string above from a forum. Now I would prefer to use a regular expression to remove these parts.

Recommended answer

import re
# Remove every tag: '<', then as few characters as possible, then '>'
result = re.sub(r'<.*?>', '', string)
print(result)
# i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff

The re.sub function takes a regular expression and replaces every match in the string with the second argument. In this case, we search for all tags ('<.*?>') and replace them with nothing ('').
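If the same cleanup has to run on many forum posts, the pattern can be wrapped in a small function. The following is a minimal sketch, not part of the original answer: the names TAG_PATTERN, strip_tags, raw, and cleaned are hypothetical, and the pattern is the one given above.

```python
import re

# Hypothetical names for illustration; the pattern is the one from the answer.
TAG_PATTERN = re.compile(r'<.*?>')


def strip_tags(text):
    """Return text with every <...> span removed."""
    return TAG_PATTERN.sub('', text)


raw = ('i think mabe 124 + <font color="black"><font face="Times New Roman">'
       'but I don\'t have a big experience it just how I see it in my eyes '
       '<font color="green"><font face="Arial">fun stuff')

cleaned = strip_tags(raw)
print(cleaned)
# i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff
```

Note that this pattern treats any text between '<' and '>' as a tag, and '.' does not match a newline by default, so a tag split across lines would be left in place.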

The ? after .* makes the match non-greedy: .*? matches as few characters as possible, so each match stops at the first '>' instead of the last one.
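To see why the non-greedy form matters here, the two patterns can be compared on a short string. This is an illustrative sketch; the sample text and variable names are made up for the comparison.

```python
import re

# Hypothetical sample string containing two tags.
sample = 'a <b>bold</b> word'

# Greedy: '.*' runs from the first '<' to the LAST '>', swallowing the text between the tags.
greedy = re.sub(r'<.*>', '', sample)

# Non-greedy: '.*?' stops at the first '>' after each '<', so only the tags are removed.
non_greedy = re.sub(r'<.*?>', '', sample)

print(repr(greedy))      # 'a  word'
print(repr(non_greedy))  # 'a bold word'
```

With the greedy pattern, the word between the tags is lost along with the tags themselves, which is why the answer uses '<.*?>'.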

More about the re module.
