Can I use LINQ to strip repeating spaces from a string?


Question


A quick brain teaser: given a string

This  is a string with  repeating   spaces

What would be the LINQ expression to end up with

This is a string with repeating spaces

Thanks!

For reference, here's one non-LINQ way:

private static IEnumerable<char> RemoveRepeatingSpaces(IEnumerable<char> text)
{
  bool isSpace = false;
  foreach (var c in text)
  {
    if (isSpace && char.IsWhiteSpace(c)) continue;

    isSpace = char.IsWhiteSpace(c);
    yield return c;
  }
}

Solution

Since nobody seems to have given a satisfactory answer, I came up with one. Here's a string-based solution (it needs .NET 4 for Zip):

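public static string RemoveRepeatedSpaces(this string s)
{
    // Guard: s[0] would throw on an empty string (and on null).
    if (string.IsNullOrEmpty(s)) return s;

    // Pair each character with its predecessor; drop a space that follows a space.
    return s[0] + string.Join("",
           s.Zip(
               s.Skip(1),
               (x, y) => x == y && y == ' ' ? (char?)null : y));
}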
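
To see why this works, it can help to look at the pairs that Zip produces. The sketch below is only an illustration: `ZipPairsSketch` and `DescribePairs` are hypothetical names, not part of the answer. It lists each (previous, current) pair together with the keep/drop decision the lambda would make.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ZipPairsSketch
{
    // Hypothetical helper (not from the original answer): returns one line per
    // (previous, current) pair produced by s.Zip(s.Skip(1), ...), with the
    // decision the space-collapsing lambda would make for the current char.
    public static List<string> DescribePairs(string s)
    {
        return s.Zip(s.Skip(1), (x, y) =>
        {
            // Same condition as in the answer: a space preceded by a space.
            bool drop = x == y && y == ' ';
            return "(" + x + "," + y + ") " + (drop ? "drop" : "keep");
        }).ToList();
    }
}
```

For the input "a  b" (two spaces) the pairs are (a, ), ( , ) and ( ,b), and only the middle one is dropped. The first character never shows up as a "current" element, which is why the answer prepends s[0].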

However, this is just a special case of removing repeated elements from a sequence, so here's the generalized version:

public static IEnumerable<T> RemoveRepeatedElements<T>(
                             this IEnumerable<T> s, T dup)
{
    return s.Take(1).Concat(
            s.Zip(
                s.Skip(1),
                (x, y) => x.Equals(y) && y.Equals(dup) ? (object)null : y)
            .OfType<T>());
}

Of course, that's really just a more specific version of a function that removes all consecutive duplicates from its input stream:

public static IEnumerable<T> RemoveRepeatedElements<T>(this IEnumerable<T> s)
{
    return s.Take(1).Concat(
            s.Zip(
                s.Skip(1),
                (x, y) => x.Equals(y) ? (object)null : y)
            .OfType<T>());
}
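
One caveat with the Zip-based versions: x.Equals(y) throws if x is null, OfType<T>() also filters out any null elements that were legitimately in the sequence, and the source is enumerated three times (Take, Zip and Skip). If that matters, a single-pass iterator is one possible alternative. This is a sketch under those assumptions, not the answer's method; `SequenceSketch` and `RemoveConsecutiveDuplicates` are hypothetical names.

```csharp
using System.Collections.Generic;

public static class SequenceSketch
{
    // Hypothetical single-pass alternative to the Zip-based version:
    // yields an element only if it differs from the one just before it.
    public static IEnumerable<T> RemoveConsecutiveDuplicates<T>(this IEnumerable<T> source)
    {
        // The default comparer copes with null elements, unlike x.Equals(y).
        var comparer = EqualityComparer<T>.Default;
        bool hasPrevious = false;
        T previous = default(T);

        foreach (T item in source)
        {
            // Skip an element equal to its immediate predecessor.
            if (hasPrevious && comparer.Equals(previous, item)) continue;

            previous = item;
            hasPrevious = true;
            yield return item;
        }
    }
}
```

It is used the same way, e.g. string.Join("", "aabbbc".RemoveConsecutiveDuplicates()), and it enumerates the source only once, so it also works on sequences that cannot be replayed.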

And obviously you can implement the first function in terms of the second:

public static string RemoveRepeatedSpaces(this string s)
{
    return string.Join("", s.RemoveRepeatedElements(' '));
}

BTW, I benchmarked my last function against the regex version (Regex.Replace(s, " +", " ")) and they were within nanoseconds of each other, so the extra LINQ overhead is negligible compared to the extra regex overhead. When I generalized it to remove all consecutive duplicate characters, the equivalent regex (Regex.Replace(s, "(.)\\1+", "$1")) was 3.5 times slower than my LINQ version (string.Join("", s.RemoveRepeatedElements())).
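
The answer doesn't show how the benchmark was run. For anyone who wants to repeat the comparison, a rough timing harness built on Stopwatch might look like the sketch below. `BenchmarkSketch`, `AverageNanoseconds` and `CollapseWithRegex` are hypothetical names, and the numbers will vary with machine, runtime and input, so treat it as a starting point rather than a reproduction of the figures above.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class BenchmarkSketch
{
    // Hypothetical helper: runs `action` `iterations` times and returns the
    // average elapsed time per call in nanoseconds.
    public static double AverageNanoseconds(Action action, int iterations)
    {
        action(); // warm-up call so one-time JIT cost is not part of the timing

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        // Milliseconds -> nanoseconds, averaged over all calls.
        return watch.Elapsed.TotalMilliseconds * 1000000.0 / iterations;
    }

    // The regex version quoted in the answer, wrapped so it can be timed.
    public static string CollapseWithRegex(string s)
    {
        return Regex.Replace(s, " +", " ");
    }
}
```

A call such as BenchmarkSketch.AverageNanoseconds(() => BenchmarkSketch.CollapseWithRegex(input), 100000) can then be compared with the same call wrapping the LINQ or StringBuilder version. A dedicated tool like BenchmarkDotNet would give more trustworthy numbers than a hand-rolled loop.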

I also tried the "ideal" procedural solution:

public static string RemoveRepeatedSpaces(string s)
{
    StringBuilder sb = new StringBuilder(s.Length);
    char lastChar = '\0';
    foreach (char c in s)
        if (c != ' ' || lastChar != ' ')
            sb.Append(lastChar = c);
    return sb.ToString();
}

This is more than 5 times faster than a regex!
