Complex regular expression for two consecutive words or a single word (C#)


Question

I have a list of every city in the world in my database, and an application written in C# that needs to search an incoming string to determine whether any of those cities appear in it. However, I'm having trouble working out the regex pattern, because some city names are two words, like "San Francisco". Thanks in advance for any help.

Recommended answer

Probably the easiest way is to load all your cities into an in-memory list (select name from cities) and then use a regex or simple string methods to check whether any of them occur in the text.

 List<string> cities = GetCitiesFromDatabase(); // you need to implement this yourself
 string text = @"the text containing city names such as Amsterdam and San Francisco";

 // For a case-insensitive search, pass a StringComparison such as CurrentCultureIgnoreCase.
 bool containsACity = cities.Any(city => text.Contains(city));
 IEnumerable<string> containedCities = cities.Where(city => text.Contains(city));
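Whether the case-insensitive variant compiles depends on the runtime: the `string.Contains(string, StringComparison)` overload exists in .NET Core 2.1 and later, but not in .NET Framework. The following is a minimal sketch that uses `IndexOf` instead, which is available on both. The class and method names are hypothetical, and it uses `OrdinalIgnoreCase` rather than the culture-sensitive comparison mentioned above, which is an assumption about what the application needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CityContains
{
    // Hypothetical helper: returns every city whose name occurs in the text,
    // ignoring case. This is a plain substring test, so "Amsterdam" also
    // matches inside "Amsterdamned".
    public static List<string> FindCities(IEnumerable<string> cities, string text)
    {
        return cities
            .Where(city => text.IndexOf(city, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }
}
```

Two-word names need no special handling here, because the space in "San Francisco" is just another character in the substring being searched for.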

To ensure that 'Amsterdam' doesn't match inside 'Amsterdamned', you could use a regular expression with word boundaries instead of Contains:

 // Add RegexOptions.IgnoreCase as a third argument for case-insensitive matches.
 bool containsACity = cities.Any(city => Regex.IsMatch(text, @"\b" + Regex.Escape(city) + @"\b"));
 IEnumerable<string> containedCities = cities.Where(city => Regex.IsMatch(text, @"\b" + Regex.Escape(city) + @"\b"));
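For the two-word case from the question, no special pattern is needed: once the name is escaped, the space inside "San Francisco" is matched literally, and the `\b` anchors only constrain the two ends of the name. A small sketch of this, with a hypothetical helper name:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CityWordMatch
{
    // Hypothetical helper: true when `city` occurs in `text` as whole words.
    public static bool ContainsCity(string text, string city)
    {
        // Regex.Escape neutralises metacharacters such as '.' in the name.
        // It also escapes the space, which still matches a literal space.
        string pattern = @"\b" + Regex.Escape(city) + @"\b";
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase);
    }
}
```

With this helper, `ContainsCity("I moved to San Francisco last year", "San Francisco")` is true, while `ContainsCity("Amsterdamned is a film", "Amsterdam")` is false. One caveat: `\b` only matches between a word character and a non-word character, so a name that ends in punctuation (for example one ending in a period) may fail to match when followed by a space.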

Alternatively, you can build one large regular expression that matches any of the cities and execute it once:

 string regex = @"\b(?:" + String.Join("|", cities.Select(city => Regex.Escape(city)).ToArray()) + @")\b";
 bool containsACity = Regex.IsMatch(text, regex, RegexOptions.IgnoreCase);
 IEnumerable<string> containedCities = Regex.Matches(text, regex, RegexOptions.IgnoreCase).Cast<Match>().Select(m => m.Value);

You can improve the performance of these calls by caching the list of cities or caching the regular expression, and improve it further by creating a static readonly Regex object with RegexOptions.Compiled.
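The caching idea could look roughly like the sketch below. `CityMatcher` is a hypothetical class, not something from the original answer. It builds the alternation once and keeps the compiled `Regex` in a field; an instance can then be held in a static readonly field. Sorting longer names first is an added assumption: .NET tries alternatives left to right, so without it a short name such as "San" could win over "San Francisco".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public sealed class CityMatcher
{
    private readonly Regex _regex;

    public CityMatcher(IEnumerable<string> cities)
    {
        // Escape each name and put longer names first so they are tried first.
        List<string> alternatives = cities
            .Where(c => !string.IsNullOrWhiteSpace(c))
            .OrderByDescending(c => c.Length)
            .Select(Regex.Escape)
            .ToList();

        // With no cities, use a pattern that can never match.
        string pattern = alternatives.Count == 0
            ? "(?!)"
            : @"\b(?:" + string.Join("|", alternatives) + @")\b";

        _regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Compiled);
    }

    public bool ContainsCity(string text) => _regex.IsMatch(text);

    // Returns the matched text as it appears in the input, in order of occurrence.
    public List<string> FindCities(string text) =>
        _regex.Matches(text).Cast<Match>().Select(m => m.Value).ToList();
}
```

For a list as large as every city in the world, constructing a compiled regex of this size may itself take noticeable time and memory, so it is worth measuring it against the per-city loop before committing to it.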

Another solution would be to do the comparison in the database. Instead of keeping a local list of cities in memory, send the input text to the database and use a LIKE statement or a regex inside the database to compare the list of cities against the text. Depending on the number of cities and the size of the text, this might be faster, but whether it is possible at all depends on the database being used.
