C# Regex.Split: Removing empty results


Question

I am working on an application which imports thousands of lines where every line has a format like this:

|* 9070183020  |04.02.2011    |107222     |M/S SUNNY MEDICOS                  |GHAZIABAD                          |      32,768.00 |

I am using the following Regex to split each line into the data I need:

Regex lineSplitter = new Regex(@"(?:^\|\*|\|)\s*(.*?)\s+(?=\|)");
string[] columns = lineSplitter.Split(data);

foreach (string c in columns)
    Console.Write("[" + c + "] ");

This gives me the following result:

[] [9070183020] [] [04.02.2011] [] [107222] [] [M/S SUNNY MEDICOS] [] [GHAZIABAD] [] [32,768.00] [|]

Now I have two questions.
1. How do I remove the empty results? I know I can use:

string[] columns = lineSplitter.Split(data).Where(s => !string.IsNullOrEmpty(s)).ToArray();

but is there any built in method to remove the empty results?

2. How can I remove the last pipe?

Thanks for any help.
Regards,
Yogesh.


I think my question was a little misunderstood. It was never about how I can do it. It was only about how I can do it by changing the Regex in the above code.

I know that I can do it in many ways. I have already done it with the code mentioned above with a Where clause and with an alternate way which is also (more than two times) faster:

Regex regex = new Regex(@"(^\|\*\s*)|(\s*\|\s*)");
data = regex.Replace(data, "|");

string[] columns = data.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);

Secondly, as a test case, my system can parse 92k+ such lines in less than 1.5 seconds in the original method and in less than 700 milliseconds in the second method, where I will never find more than a couple of thousand in real cases, so I don't think I need to think about the speed here. In my opinion thinking about speed in this case is Premature optimization.

I have found the answer to my first question: it cannot be done with Split as there is no such option built in.

Still looking for answer to my second question.

Recommended answer

Regex lineSplitter = new Regex(@"[\s*\*]*\|[\s*\*]*");
var columns = lineSplitter.Split(data).Where(s => s != String.Empty);

Or you can simply do:

string[] columns = data.Split(new char[] {'|'}, StringSplitOptions.RemoveEmptyEntries);
foreach (string c in columns) this.textBox1.Text += "[" + c.Trim(' ', '*') + "] " + Environment.NewLine;

And no, Regex.Split has no option to remove empty entries the way String.Split does with StringSplitOptions.RemoveEmptyEntries.

You can also use Regex.Matches instead of Split.
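
The answer does not show what the Matches approach would look like, so here is a minimal sketch of one way to do it. The pattern and the name fieldMatcher are assumptions for illustration, not part of the original answer. Instead of describing the delimiters, the pattern describes a field: it starts at a character that is not a pipe, an asterisk, or whitespace, and ends at the last non-whitespace character before the next pipe. Since only field contents are matched, there should be no empty entries and no trailing pipe to remove.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string data = "|* 9070183020  |04.02.2011    |107222     |M/S SUNNY MEDICOS                  |GHAZIABAD                          |      32,768.00 |";

        // Hypothetical pattern: match the content of each field rather than the separators.
        // [^|*\s]            first character: not a pipe, asterisk, or whitespace
        // (?:[^|]*[^|\s])?   optionally more non-pipe characters, ending on a non-whitespace one
        Regex fieldMatcher = new Regex(@"[^|*\s](?:[^|]*[^|\s])?");

        string[] columns = fieldMatcher.Matches(data)
            .Cast<Match>()          // MatchCollection is non-generic on older frameworks
            .Select(m => m.Value)
            .ToArray();

        foreach (string c in columns)
            Console.Write("[" + c + "] ");
    }
}
```

Note that this sketch also drops fields that contain only whitespace, and it would strip a leading asterisk from any field, not just the first one. If either matters for your data, the pattern would need adjusting.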
