Regex Split String at particular word pattern

Views: 259 Posted: 2016/10/11 11:34:01 Tags: regex, c#-4.0


Question

I am trying to split a string that could look like this:

International Bank for Reconstruction & Development (NAICS: 928120; SIC: 6081) World Bank (NAICS: 928120; SIC: 6081)

into this

International Bank for Reconstruction & Development
World Bank

or any of these:

International Bank for Reconstruction & Development
International Bank for Reconstruction & Development (SIC: 6081)
International Bank for Reconstruction & Development (NAICS: 928120)

into this

International Bank for Reconstruction & Development

There could be any number of matches.

I've tried a few things; using a negated character class doesn't work:

[^\(NAICS: (\d+);\)]+
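A likely reason the attempt above fails: everything between `[^` and `]` is read as a set of individual characters, not as a sequence, so the pattern matches runs of characters that are none of `(`, `N`, `A`, `I`, `C`, `S`, `:`, space, a digit, `+`, `;` or `)`. The following is a minimal sketch of that behaviour; the class and method names (`NegatedClassDemo`, `Runs`) are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // The pattern from the question, unchanged.
    private const string Attempt = @"[^\(NAICS: (\d+);\)]+";

    // Returns every run of characters that the negated class accepts.
    public static List<string> Runs(string input)
    {
        var runs = new List<string>();
        foreach (Match m in Regex.Matches(input, Attempt))
        {
            runs.Add(m.Value);
        }
        return runs;
    }
}
```

For "World Bank (NAICS: 928120)" this yields "World" and "Bank" as separate matches, because the space is one of the excluded characters. A title containing any excluded capital letter is cut as well: "International Bank" yields "nternational" and "Bank".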

I'm using C# Regex.

Solution

If you just want a regex to split on, this might work: \([^)]*(?:(?:SIC|NAICS):[^)]*)+\)
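As a rough sketch of how the split regex could be used from C#: `Regex.Split` returns the text between delimiter matches, including surrounding whitespace and an empty trailing piece when the input ends with a delimiter, so the pieces are trimmed and empty ones dropped. The names `TitleSplitter` and `SplitTitles` are assumptions for this example only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TitleSplitter
{
    // A parenthesized group that contains at least one "SIC:" or "NAICS:" entry.
    private const string Delimiter = @"\([^)]*(?:(?:SIC|NAICS):[^)]*)+\)";

    public static string[] SplitTitles(string input)
    {
        return Regex.Split(input, Delimiter)
                    .Select(part => part.Trim())      // drop the spaces around each title
                    .Where(part => part.Length > 0)   // drop empty pieces, e.g. after the last delimiter
                    .ToArray();
    }
}
```

Calling `TitleSplitter.SplitTitles` on the first sample string from the question gives two entries, "International Bank for Reconstruction & Development" and "World Bank". An input with no parenthesized codes comes back as a single entry.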

You could do it without split. I would take a find_all regex approach.

(?!\s*$)(.*?)(?:\([^)]*(?:(?:SIC|NAICS):[^)]*)+\)|$)
Modifiers: s (dot allows newline) and g (global)

Be warned: this allows parenthesized text that is not a '(SIC: ...)' or '(NAICS: ...)' group to remain in the title.
But those aren't the delimiter, right?
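A minimal sketch of the find-all approach in C#, under a couple of assumptions: .NET has no `g` modifier, so `Matches` is used to return every match, and the `s` modifier is mapped to `RegexOptions.Singleline`. The title is read from capture group 1. The names `TitleFinder` and `FindTitles` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TitleFinder
{
    // Group 1 lazily captures the title up to the next SIC/NAICS group or the end of input.
    private static readonly Regex TitlePattern = new Regex(
        @"(?!\s*$)(.*?)(?:\([^)]*(?:(?:SIC|NAICS):[^)]*)+\)|$)",
        RegexOptions.Singleline);

    public static List<string> FindTitles(string input)
    {
        var titles = new List<string>();
        foreach (Match m in TitlePattern.Matches(input))
        {
            string title = m.Groups[1].Value.Trim();
            if (title.Length > 0)   // skip empty captures, e.g. two code groups back to back
            {
                titles.Add(title);
            }
        }
        return titles;
    }
}
```

On the first sample string from the question, `TitleFinder.FindTitles` returns "International Bank for Reconstruction & Development" and "World Bank". A parenthesized group without `SIC:` or `NAICS:` stays inside the captured title, which is the behaviour the warning above describes.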

edit

My apologies. Those two regexes can be shortened to

\([^)]*(?:SIC|NAICS):[^)]*\)

and

(?!\s*$)(.*?)(?:\([^)]*(?:SIC|NAICS):[^)]*\)|$)
