How do I get the part before a specific identifier?

Question

Data looks like this:

John Smith\x0d123 Sample St W\x0dSampletown, Ma 02136-3739\x0d

I'm trying to parse out the last name, for example.
The script I'm trying to modify is this:

String s=Input;
String s1=s.getPart(1," ");
s1.trimWhiteSpace();
String s2=s1.getPart(0,"\x0d");
String sFullName=s2;
return sFullName;

I've used this before to find the first name in a different data string. Here, however, the data is always structured the same way, with a name followed by an address, and there may be several names. With the above I can get as far as Smith\x0d, but obviously that's not what I want. I just want what's in front of the identifier.

HELP! Thanks

Recommended answer

phil.o gave several good options, so I'll just extend them by finishing up the String.Split() and regular expression ideas.

His String.Split() solution produces three strings, one for each part of the data.
To get the last name from the first part:
string input = @"John Smith\x0d123 Sample St W\x0dSampletown, Ma 02136-3739\x0d";
 
string[] elements = input.Split(
   new string[] { @"\x0d" },
   StringSplitOptions.RemoveEmptyEntries
);
 
// Here elements will contain three strings:
// elements[0] == "John Smith"
// elements[1] == "123 Sample St W"
// elements[2] == "Sampletown, Ma 02136-3739"

// Continuing use of Split():
string[] nameWords = elements[0].Trim().Split(' ');
string lastName = nameWords[nameWords.Length - 1];
// Or, if you are comfortable with LINQ (needs "using System.Linq;"),
// replace the two lines above with:
// string lastName = elements[0].Trim().Split(' ').Last();



Personally, for the .Split() call that produces elements, I would define a static readonly array for the separator parameter:

private static readonly string[] Delimitor_x0d = new string[] { @"\x0d" };





Aside: does your data contain the 4-character sequence \x0d, or the single character with the value hex 0d?
In the single-character case you should not use the @ prefix on the strings, and you should write the character as \r or \x000d. Otherwise, when the compiler parses a source-code string literal (as in the code here), the \x escape consumes up to four hex digits, so "John Smith\x0d123 Sample..." is read as the character \x0d12 followed by "3 Sample...". Also, the separator parameter can be char[] instead of string[].
Below, I'm going to assume it is the single-character case.
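
To make the single-character case concrete, here is a minimal sketch of the same Split() approach with a char[] separator. It assumes the data really contains the carriage-return character (hex 0d); the variable names are only illustrative.

```csharp
using System;

// Assumption: the data contains the real carriage-return character (hex 0d),
// written as \r in an ordinary (non-verbatim) string literal.
string input = "John Smith\r123 Sample St W\rSampletown, Ma 02136-3739\r";

// A char[] separator is enough because the delimiter is a single character.
char[] separators = { '\r' };
string[] elements = input.Split(separators, StringSplitOptions.RemoveEmptyEntries);

// The name is the first element; the last name is its last space-separated word.
string[] nameWords = elements[0].Trim().Split(' ');
string lastName = nameWords[nameWords.Length - 1];

Console.WriteLine(lastName); // Smith
```

If the first line can hold several names, the same last-word rule would still return only the final word, so that case may need extra handling.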

Using Regular expressions:

string input = "John Smith\r123 Sample St W\rSampletown, Ma 02136-3739\r";
Regex pattern = new Regex(@"^.*?([^ ]+)\r", RegexOptions.IgnoreCase);
var match = pattern.Match(input);
string lastName = string.Empty;
if (match.Success)
  lastName = match.Groups[1].Value;


Or do you perhaps want what is between the penultimate identifier and the last one?

I do not know how the getPart() and trimWhiteSpace() methods do their job, as they are not part of the .NET framework, so it is hard to help with the actual problem.

This can be solved in different ways.

Substrings
There are some methods in the framework that would give you the same desired behaviour:
String.IndexOf method
String.LastIndexOf method
So, schematically, you could:
- take the substring from 0 to (index of last identifier - 1)
- from the string obtained, take the substring from (index of its last identifier + 4, because the identifier is 4 characters long) to the end
That may do the trick.
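
The two steps above can be sketched as follows. This is only an illustration, assuming the identifier is the literal 4-character text \x0d; the names Identifier, head, lastPart and firstPart are made up for the example. The last lines apply the same idea with IndexOf to get the part in front of the first identifier.

```csharp
using System;

// Assumption: the identifier is the literal 4-character text \x0d,
// so verbatim (@) literals are used to keep the backslash as-is.
const string Identifier = @"\x0d";
string input = @"John Smith\x0d123 Sample St W\x0dSampletown, Ma 02136-3739\x0d";

// Step 1: keep everything before the last identifier.
int last = input.LastIndexOf(Identifier, StringComparison.Ordinal);
string head = last >= 0 ? input.Substring(0, last) : input;

// Step 2: from that string, keep what follows its own last identifier.
int previous = head.LastIndexOf(Identifier, StringComparison.Ordinal);
string lastPart = previous >= 0 ? head.Substring(previous + Identifier.Length) : head;

// The same idea with IndexOf gives the part in front of the first identifier.
int first = input.IndexOf(Identifier, StringComparison.Ordinal);
string firstPart = first >= 0 ? input.Substring(0, first) : input;

Console.WriteLine(lastPart);  // Sampletown, Ma 02136-3739
Console.WriteLine(firstPart); // John Smith
```

Using Identifier.Length instead of a hard-coded 4 keeps the offset correct if the identifier turns out to be the single character \r.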

Use String.Split()
See the String.Split method. Schematically:
string input = @"John Smith\x0d123 Sample St W\x0dSampletown, Ma 02136-3739\x0d";

string[] elements = input.Split(
   new string[] { @"\x0d" },
   StringSplitOptions.RemoveEmptyEntries
);

// Here elements will contain three strings:
// elements[0] == "John Smith"
// elements[1] == "123 Sample St W"
// elements[2] == "Sampletown, Ma 02136-3739"





Regular expressions
A regular expression could also be helpful in this case. If that is an option, you should have a look at it.

Last thing: to match a backslash in your string, you have to escape it with another backslash, or prepend an @ to your string declaration. For example:

String s2=s1.getPart(0, "\\x0d");






OR

String s2=s1.getPart(0, @"\x0d");
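
As a quick check of the two spellings, the following sketch compares them with the single carriage-return character. It uses plain C# string literals only; the variable names are illustrative.

```csharp
using System;

string escaped = "\\x0d";   // backslash escaped with another backslash
string verbatim = @"\x0d";  // verbatim literal: the backslash is kept as-is
string control = "\r";      // the single carriage-return character (hex 0d)

Console.WriteLine(escaped == verbatim); // True
Console.WriteLine(escaped.Length);      // 4
Console.WriteLine(control.Length);      // 1
Console.WriteLine((int)control[0]);     // 13
```

The first two literals are the same 4-character text, while the third is one character, so which one to pass depends on what the data actually contains.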





Good luck! Hope this helps.

