Regular Expression to find "lastname, firstname middlename" format


Question

I am trying to match the format "abc, def g", which is the name format "lastname, firstname middlename". I think a regular expression is the best fit, but I have no experience with regex. I did some reading and tried a few expressions, but with no luck. One additional point: there may be more than one space between the words.

This is what I tried, but it is not working:

(([A-Z][,]\s?)*([A-Z][a-z]+\s?)+([A-Z]\s?[a-z]*)*)

I need help! Any idea how I can make this match only the format above?

Thanks!

Answer

In the end I am using:

([A-Za-z]+),\\s*([A-Za-z]+)\\s*([A-Za-z]+)

Thanks everyone for the suggestions.

Recommended answer

Your sample input is "lastname, firstname middlename". For that, you can use the following regexp to extract the last, first and middle name. It allows multiple whitespace characters between the parts and both upper-case and lower-case letters in the names; all three parts are mandatory:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "Lastname,   firstname   middlename";
String regexp = "([A-Za-z]+),\\s+([A-Za-z]+)\\s+([A-Za-z]+)";

Pattern pattern = Pattern.compile(regexp);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {  // group() throws IllegalStateException unless a match was found
    System.out.println("Lastname  : " + matcher.group(1));
    System.out.println("Firstname : " + matcher.group(2));
    System.out.println("Middlename: " + matcher.group(3));
}

Short summary:

([A-Za-z]+)   First capture group - matches one or more letters to extract the last name
,\\s+         Capture group is followed by a comma and one or more spaces
([A-Za-z]+)   Second capture group - matches one or more letters to extract the first name
\\s+          Capture group is followed by one or more spaces
([A-Za-z]+)   Third capture group - matches one or more letters to extract the middle name
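
Note that find() looks for the pattern anywhere in the input, so a longer string that merely contains a name also matches. If the goal is to accept only strings that consist entirely of the name, Matcher.matches() can be used instead. A minimal sketch, assuming a hypothetical helper class NameMatcher that is not part of the original answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameMatcher {
    // Same pattern as above, compiled once and reused.
    private static final Pattern NAME =
            Pattern.compile("([A-Za-z]+),\\s+([A-Za-z]+)\\s+([A-Za-z]+)");

    /** Returns {last, first, middle}, or null if the whole input is not a name. */
    public static String[] parse(String input) {
        Matcher m = NAME.matcher(input);
        if (!m.matches()) {   // matches() requires the entire input to fit the pattern
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // prints Lastname|firstname|middlename
        System.out.println(String.join("|", parse("Lastname,   firstname   middlename")));
        // prints true: the surrounding text makes the full-string match fail
        System.out.println(parse("id=42 Lastname, firstname middlename!") == null);
    }
}
```

With find() the second input would still yield a match, because the name appears somewhere inside it.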

This only works if the names contain Latin letters only. You should probably use a more open match for the characters:

String input = "Müller,   firstname  middlename";
String regexp = "(.+),\\s+(.+)\\s+(.+)";

This matches any character for lastname, firstname and middlename.
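
Because . also matches spaces and commas, the greedy (.+) groups can capture more than intended: with the input above, the second group comes back as "firstname " with a trailing space. One stricter option, offered here as a sketch rather than as part of the original answer, is the Unicode letter class \p{L}, which accepts letters from any script but nothing else. The class name UnicodeNameDemo and its helper are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeNameDemo {
    // Loose pattern from the answer: '.' matches any character, including spaces.
    public static final Pattern LOOSE = Pattern.compile("(.+),\\s+(.+)\\s+(.+)");
    // Stricter alternative (an assumption, not from the answer): Unicode letters only.
    public static final Pattern LETTERS =
            Pattern.compile("(\\p{L}+),\\s+(\\p{L}+)\\s+(\\p{L}+)");

    /** Returns the second capture group (the first name), or null if nothing matches. */
    public static String firstName(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "M\u00fcller,   firstname  middlename"; // \u00fc is the letter u-umlaut
        System.out.println("[" + firstName(LOOSE, input) + "]");   // prints [firstname ]
        System.out.println("[" + firstName(LETTERS, input) + "]"); // prints [firstname]
    }
}
```

The trade-off is that \p{L} rejects names containing hyphens or apostrophes; those characters would have to be added to the character class explicitly if they should be allowed.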

If the whitespace is optional (only the first occurrence, after the comma, can be optional; otherwise we cannot distinguish between firstname and middlename), use * instead of + there:

String input = "Müller,firstname  middlename";
String regexp = "(.+),\\s*(.+)\\s+(.+)";
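
To see why the whitespace between first and middle name has to stay mandatory, consider an input with no middle name at all. The sketch below compares the letters-only pattern with \\s* in both places (as in the asker's final version) against the variant where only the first separator is optional. The class OptionalSpaceDemo and its helper method are hypothetical names used for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalSpaceDemo {
    /** Returns "first|middle" for the first match of the regex, or null if there is none. */
    public static String firstAndMiddle(String regexp, String input) {
        Matcher m = Pattern.compile(regexp).matcher(input);
        return m.find() ? m.group(2) + "|" + m.group(3) : null;
    }

    public static void main(String[] args) {
        String input = "Smith, John"; // no middle name
        // Both separators optional: backtracking splits "John" into "Joh" and "n".
        System.out.println(
                firstAndMiddle("([A-Za-z]+),\\s*([A-Za-z]+)\\s*([A-Za-z]+)", input)); // prints Joh|n
        // Second separator mandatory: the input is rejected.
        System.out.println(
                firstAndMiddle("([A-Za-z]+),\\s*([A-Za-z]+)\\s+([A-Za-z]+)", input)); // prints null
    }
}
```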


As @Elliott mentions, there are other possibilities, such as using String.split(), or String.indexOf() together with String.substring(). Regular expressions are often more flexible, but harder to maintain, especially for complex expressions.
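
A rough sketch of what such a non-regex variant might look like, combining indexOf(), substring() and split(). The class SplitNameDemo is a hypothetical name, and the details (trimming, returning null on bad input) are assumptions rather than anything @Elliott specified:

```java
public class SplitNameDemo {
    /** Returns {last, first, middle}, or null if the input does not have that shape. */
    public static String[] parse(String input) {
        int comma = input.indexOf(',');
        if (comma < 0) {
            return null; // no comma, so no "lastname," part
        }
        String last = input.substring(0, comma).trim();
        // Split the remainder on runs of whitespace into first and middle name.
        String[] rest = input.substring(comma + 1).trim().split("\\s+");
        if (last.isEmpty() || rest.length != 2) {
            return null;
        }
        return new String[] { last, rest[0], rest[1] };
    }

    public static void main(String[] args) {
        // prints Lastname|firstname|middlename
        System.out.println(String.join("|", parse("Lastname,   firstname   middlename")));
        // prints true: the middle name is missing
        System.out.println(parse("Lastname, firstname") == null);
    }
}
```

Unlike the letter-based patterns, this version does not check which characters the names contain, and split() still takes a (very small) regular expression as its argument.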

In either case, implement unit tests with as many different inputs as possible (including invalid ones), so that you can verify that your algorithm still works after you modify it.
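
As an illustration of such tests, here is a small table-driven check in plain Java. It is only a sketch: the class NamePatternCheck, the method isValidName() and the chosen inputs are assumptions, and in a real project the cases would normally live in a test framework such as JUnit.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class NamePatternCheck {
    private static final Pattern NAME =
            Pattern.compile("[A-Za-z]+,\\s+[A-Za-z]+\\s+[A-Za-z]+");

    /** True if the whole input has the form "lastname, firstname middlename". */
    public static boolean isValidName(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Input mapped to the expected verdict; valid and invalid cases are mixed on purpose.
        Map<String, Boolean> cases = new LinkedHashMap<>();
        cases.put("abc, def g", true);
        cases.put("Lastname,   firstname   middlename", true);
        cases.put("Lastname, firstname", false);            // middle name missing
        cases.put("Lastname firstname middlename", false);  // comma missing
        cases.put("Lastname,firstname middlename", false);  // this pattern requires whitespace after the comma
        cases.put("", false);                               // empty input

        for (Map.Entry<String, Boolean> c : cases.entrySet()) {
            if (isValidName(c.getKey()) != c.getValue()) {
                throw new AssertionError("Unexpected result for: \"" + c.getKey() + "\"");
            }
        }
        System.out.println("All " + cases.size() + " cases passed");
    }
}
```

If the pattern is later relaxed or tightened, only the expected values in the table need to be revisited.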
