Regex/code for removing "FWD", "RE", etc, from email subject


Problem description

Given an email subject line, I'd like to clean it up, getting rid of the "Re:", "Fwd", and other junk. So, for example, "[Fwd] Re: Jack and Jill's Wedding" should turn into "Jack and Jill's Wedding".

Someone must've done this before, so I'm hoping you can point me to a battle-tested regex or code.

Here are some examples of what needs to be cleaned up, found on this page. The regex on that page works fairly well, but is not completely there.

Fwd : Re : Re: Many
Re : Re: Many
Re  : : Re: Many
Re:: Many
Re; Many
: noah - should not match anything
RE--
RE: : Presidential Ballots for Florida
[RE: (no subject)]
Request - should not match anything
this is the subject (fwd)
Re: [Fwd: ] Blonde Joke
Re: [Fwd: [Fwd: FW: Policy]]
Re: Fwd: [Fwd: FW: "Drink Plenty of Water"]
FW: FW: (fwd) FW:  Warning from XYZ...
FW: (Fwd) (Fwd) 
Fwd: [Fwd: [Fwd: Big, Bad Surf Moving]]
FW: [Fwd: Fw: drawing by a school age child in PA (fwd)]
Re: Fwd

Solution

Try this one (replace with ''):

/([\[\(] *)?(RE|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$/igm

(If you pass each subject through as its own string, you don't need the m modifier; it is only there so that $ matches the end of each line, not just the end of the string, for multiline input.)
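As a rough illustration of the "replace with ''" step, here is a minimal JavaScript sketch that runs the pattern over a few of the sample subjects. The `cleanSubject` helper and the final `trim()` are assumptions added for this example, not part of the answer; `trim()` is there because removing a trailing "(fwd)" leaves a space behind.

```javascript
// The pattern from the answer, copied unchanged.
const SUBJECT_JUNK = /([\[\(] *)?(RE|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$/igm;

// Hypothetical helper (the name and the trim() call are assumptions,
// not part of the answer). trim() drops the space left behind when a
// trailing "(fwd)" is removed.
function cleanSubject(subject) {
  return subject.replace(SUBJECT_JUNK, '').trim();
}

const samples = [
  "[Fwd] Re: Jack and Jill's Wedding",
  'Re: [Fwd: [Fwd: FW: Policy]]',
  'this is the subject (fwd)',
  'Request - should not match anything',
  'Re: Fwd',
];

for (const subject of samples) {
  console.log(JSON.stringify(subject), '->', JSON.stringify(cleanSubject(subject)));
}
```

One thing to watch for: the pattern is not anchored to a word boundary, so a word that merely ends in "re" or "fw" right before a colon is also hit. With this sketch, "Software: update" comes out as "Softwaupdate", so the pattern may need a guard for subjects like that.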

See it in action here.

Explanation of regex:

([\[\(] *)?            # starting [ or (, followed by optional spaces
(RE|FWD?) *            # RE or FW or FWD, followed by optional spaces
([-:;)\]][ :;\])-]*|$) # only count it as a Re or FWD if it is followed by 
                       # : or - or ; or ] or ) or end of line
                       # (and after that you can have more of these symbols with
                       #  spaces in between)
|                      # OR
\]+ *$                 # match any trailing \] at end of line 
                       # (we assume the brackets () occur around a whole Re/Fwd
                       #  but the square brackets [] occur around the whole 
                       #  subject line)
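JavaScript regex literals cannot carry inline comments like the breakdown above, so one way to keep the explanation next to the code is to assemble the same pattern from commented fragments. This is only a sketch: the fragment names (`opener`, `tag`, `terminator`, `trailing`) are made up for the example.

```javascript
// Each fragment mirrors one line of the explanation above.
// String.raw keeps the backslashes exactly as written.
const opener     = String.raw`([\[\(] *)?`;            // starting [ or (, then optional spaces
const tag        = String.raw`(RE|FWD?) *`;            // RE, FW or FWD, then optional spaces
const terminator = String.raw`([-:;)\]][ :;\])-]*|$)`; // - : ; ) or ], then more of them or spaces; or end of line
const trailing   = String.raw`\]+ *$`;                 // trailing ] at end of line

// Same pattern and flags as the one-line literal in the answer.
const pattern = new RegExp(opener + tag + terminator + '|' + trailing, 'gim');

const subject = 'Re: [Fwd: [Fwd: FW: Policy]]';

// List every piece the pattern would strip, in order.
const removed = [...subject.matchAll(pattern)].map((m) => m[0]);
console.log(removed);                      // [ 'Re: ', '[Fwd: ', '[Fwd: ', 'FW: ', ']]' ]
console.log(subject.replace(pattern, '')); // Policy
```

The first four matches come from the Re/Fwd alternative and the last one, "]]", from the trailing-bracket alternative.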

Flags:

i: case insensitive.

g: global match (match all the Re/Fwd you can find).

m: lets the '$' in the regex match at the end of each line of a multiline input, not just at the end of the string. This only matters if you feed all your subjects to the regex at once; if you feed in one subject at a time you can drop it, because end of line is then end of string.
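To make the effect of the m flag concrete, here is a small sketch comparing the pattern with and without it on a newline-separated batch of subjects. The sample batch and the variable names are made up for the illustration.

```javascript
// Same pattern twice; the only difference is the m flag.
const withM    = /([\[\(] *)?(RE|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$/igm;
const withoutM = /([\[\(] *)?(RE|FWD?) *([-:;)\]][ :;\])-]*|$)|\]+ *$/ig;

// Three subjects fed in at once, one per line.
const batch = 'Re: Fwd\nFW: Policy]\nRe: Many';

// With m, $ matches at every line end, so the bare "Fwd" and the
// trailing "]" on the first two lines are removed.
console.log(JSON.stringify(batch.replace(withM, '')));    // "\nPolicy\nMany"

// Without m, $ only matches at the very end of the string, so they stay.
console.log(JSON.stringify(batch.replace(withoutM, ''))); // "Fwd\nPolicy]\nMany"

// One subject per string: m is not needed.
const perSubject = batch.split('\n').map((s) => s.replace(withoutM, ''));
console.log(JSON.stringify(perSubject));                  // ["","Policy","Many"]
```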
