C# Regex.Split: Removing empty results
Problem description
I am working on an application which imports thousands of lines where every line has a format like this:
|* 9070183020 |04.02.2011 |107222 |M/S SUNNY MEDICOS |GHAZIABAD | 32,768.00 |
I am using the following Regex to split each line into the data I need:
Regex lineSplitter = new Regex(@"(?:^\|\*|\|)\s*(.*?)\s+(?=\|)");
string[] columns = lineSplitter.Split(data);
foreach (string c in columns)
    Console.Write("[" + c + "] ");
This is giving me the following result:
[] [9070183020] [] [04.02.2011] [] [107222] [] [M/S SUNNY MEDICOS] [] [GHAZIABAD] [] [32,768.00] [|]
Now I have two questions.
1. How do I remove the empty results? I know I can use:
string[] columns = lineSplitter.Split(data).Where(s => !string.IsNullOrEmpty(s)).ToArray();
but is there any built-in way to remove the empty results?
2. How can I remove the last pipe?
Thanks for any help.
Regards,
Yogesh.
EDIT:
I think my question was a little misunderstood. It was never about how I can do it at all. It was only about how I can do it by changing the Regex in the above code.
I know that I can do it in many ways. I have already done it with the code mentioned above, using a Where clause, and with an alternative approach which is also more than twice as fast:
Regex regex = new Regex(@"(^\|\*\s*)|(\s*\|\s*)");
data = regex.Replace(data, "|");
string[] columns = data.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
Secondly, as a test case, my system can parse 92k+ such lines in less than 1.5 seconds with the original method and in less than 700 milliseconds with the second method. In real cases I will never see more than a couple of thousand lines, so I don't think I need to worry about speed here. In my opinion, thinking about speed in this case is premature optimization.
I have found the answer to my first question: it cannot be done with Split, as there is no such option built in.
Still looking for an answer to my second question.
Answers
Regex lineSplitter = new Regex(@"[\s*\*]*\|[\s*\*]*");
var columns = lineSplitter.Split(data).Where(s => s != String.Empty);
or you could simply do:
string[] columns = data.Split(new char[] {'|'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string c in columns) this.textBox1.Text += "[" + c.Trim(' ', '*') + "] " + "\r\n";
And no, Regex.Split has no option to remove empty entries the way String.Split does.
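It may help to see where the empty entries come from. Regex.Split returns the text between matches and, when the pattern contains a capturing group, the captured text as well. With the pattern from the question the matches sit back to back, so every piece of text between them is empty, and the final pipe is left over because the lookahead only checks for it without consuming it. A minimal sketch of that behaviour, using the sample line from the question (the class name SplitDemo and the SplitLine helper are only illustrative, not part of the original code):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Pattern copied from the question: one capturing group per field.
    private static readonly Regex LineSplitter =
        new Regex(@"(?:^\|\*|\|)\s*(.*?)\s+(?=\|)");

    // Hypothetical helper: returns the raw Regex.Split result, nothing filtered.
    public static string[] SplitLine(string data)
    {
        return LineSplitter.Split(data);
    }

    public static void Main()
    {
        string data = "|* 9070183020 |04.02.2011 |107222 |M/S SUNNY MEDICOS |GHAZIABAD | 32,768.00 |";
        string[] parts = SplitLine(data);

        // Odd indexes hold the captured fields. The even indexes before the
        // last one hold the empty text between adjacent matches, and the
        // last element is the trailing "|" that no match consumed.
        for (int i = 0; i < parts.Length; i++)
            Console.WriteLine(i + ": [" + parts[i] + "]");
    }
}
```

Since the empty strings are the gaps between matches rather than something the pattern captures, filtering them afterwards (or not using Split at all) is the practical way out.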
You can also use matches.
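As a rough sketch of the Matches approach: instead of splitting, match each field and read its capturing group, which avoids both the empty entries and the trailing pipe. The pattern below is an assumption written for the sample line in the question, not something taken from the original answer, and the names MatchesDemo and ParseLine are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchesDemo
{
    // Assumed pattern: a pipe, an optional "*" marker, then the field text
    // captured without surrounding whitespace. The lookahead requires a
    // closing pipe, so the final "|" on the line cannot start a match.
    private static readonly Regex FieldPattern =
        new Regex(@"\|\*?\s*([^|]*?)\s*(?=\|)");

    // Hypothetical helper: returns one array element per field.
    public static string[] ParseLine(string data)
    {
        return FieldPattern.Matches(data)
                           .Cast<Match>()               // MatchCollection is non-generic on older frameworks
                           .Select(m => m.Groups[1].Value)
                           .ToArray();
    }

    public static void Main()
    {
        string data = "|* 9070183020 |04.02.2011 |107222 |M/S SUNNY MEDICOS |GHAZIABAD | 32,768.00 |";
        foreach (string c in ParseLine(data))
            Console.Write("[" + c + "] ");
        Console.WriteLine();
    }
}
```

For the sample line this prints [9070183020] [04.02.2011] [107222] [M/S SUNNY MEDICOS] [GHAZIABAD] [32,768.00]. Note that a column containing only whitespace would come back as an empty string with this pattern, so a Where filter may still be wanted if such columns can occur.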