How to make a C# 'grep' more Functional using LINQ?


Question


I have a method that performs a simplistic 'grep' across files, using an enumerable of "search strings". (Effectively, I'm doing a very naive "Find All References")

IEnumerable<string> searchStrings = GetSearchStrings();
IEnumerable<string> filesToLookIn = GetFiles();
MultiMap<string, string> references = new MultiMap<string, string>();

foreach( string fileName in filesToLookIn )
{
    foreach( string line in File.ReadAllLines( fileName ) )
    {
        foreach( string searchString in searchStrings )
        {
            if( line.Contains( searchString ) )
            {
                references.AddIfNew( searchString, fileName );
            }
        }
    }
}

Note: MultiMap<TKey,TValue> is roughly the same as Dictionary<TKey,List<TValue>>, just avoiding the NullReferenceExceptions you'd normally encounter.
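
The MultiMap class itself is not shown in the question. As a rough illustration of what the note describes, a minimal stand-in might look like the sketch below. Everything beyond the names MultiMap and AddIfNew is an assumption (the backing dictionary, the indexer, and the choice to return an empty list for a missing key), not the actual implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the MultiMap used above; the original class is not shown.
public class MultiMap<TKey, TValue>
{
    private readonly Dictionary<TKey, List<TValue>> _items =
        new Dictionary<TKey, List<TValue>>();

    // Adds the value under the key unless that key already holds it.
    public void AddIfNew(TKey key, TValue value)
    {
        if (!_items.TryGetValue(key, out List<TValue> values))
        {
            values = new List<TValue>();
            _items[key] = values;
        }

        if (!values.Contains(value))
        {
            values.Add(value);
        }
    }

    // Returns the values for a key, or an empty list when the key is absent,
    // so callers never dereference a null list.
    public IReadOnlyList<TValue> this[TKey key] =>
        _items.TryGetValue(key, out List<TValue> values)
            ? values
            : (IReadOnlyList<TValue>)new List<TValue>();
}
```

With this sketch, calling AddIfNew twice with the same key and value stores the value once, and reading a key that was never added yields an empty list rather than throwing.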


I have been trying to put this into a more "functional" style, using chained LINQ extension methods but haven't figured it out.

One dead-end attempt:

// I get lost on how to do a loop within a loop here...
// plus, I lose track of the file name
var lines = filesToLookIn.Select( f => File.ReadAllLines( f ) ).Where( // ???

And another (hopefully preserving the file name this time):

var filesWithLines =
    filesToLookIn
        .Select(f => new { FileName = f, Lines = File.ReadAllLines(f) });

var matchingSearchStrings =
    searchStrings
        .Where(ss => filesWithLines.Any(
                         fwl => fwl.Lines.Any(l => l.Contains(ss))));

But I still seem to lose the information I need.

Maybe I'm just approaching this from the wrong angle? From a performance standpoint, the loops ought to perform in roughly the same order as the original example.

Any ideas of how to do this in a more compact functional representation?

Solution

How about:

var matches =
    from fileName in filesToLookIn
    from line in File.ReadAllLines(fileName)
    from searchString in searchStrings
    where line.Contains(searchString)
    select new
    {
        FileName = fileName,
        SearchString = searchString
    };

foreach(var match in matches)
{
    references.AddIfNew(match.SearchString, match.FileName);
}

Edit:

Conceptually, the query turns each file name into a set of lines, then cross-joins that set of lines to the set of search strings (meaning each line is paired with each search string). That set is filtered to matching lines, and the relevant information for each line is selected.

The multiple from clauses are similar to nested foreach statements. Each indicates a new iteration in the scope of the previous one. Multiple from clauses translate into the SelectMany method, which selects a sequence from each element and flattens the resulting sequences into one sequence.
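
To make the nested-foreach comparison concrete, here is a small sketch that builds the same flattened list both ways. The in-memory files array and its contents are made up for illustration and stand in for real files; the code uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative in-memory data standing in for files and their lines.
var files = new[]
{
    (Name: "a.txt", Lines: new[] { "alpha", "beta" }),
    (Name: "b.txt", Lines: new[] { "gamma" })
};

// Nested foreach: the inner loop runs once per outer element.
var viaLoops = new List<string>();
foreach (var file in files)
{
    foreach (var line in file.Lines)
    {
        viaLoops.Add(file.Name + ": " + line);
    }
}

// SelectMany: project each file to its lines, then flatten into one sequence.
// The second lambda still sees the outer element, so the file name is not lost.
var viaSelectMany = files
    .SelectMany(file => file.Lines, (file, line) => file.Name + ": " + line)
    .ToList();

Console.WriteLine(viaLoops.SequenceEqual(viaSelectMany)); // prints True
```

The two-argument overload of SelectMany is what keeps the outer element available, which is the piece the first dead-end attempt in the question was missing.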

All of C#'s query syntax translates to extension methods. However, the compiler does employ some tricks. One is the use of anonymous types. Whenever two or more range variables are in the same scope, they are probably carried in an anonymous type behind the scenes. This allows arbitrary amounts of scoped data to flow through extension methods like Select and Where, which have fixed numbers of arguments. See this post for further details.

Here is the extension method translation of the above query:

var matches = filesToLookIn
    .SelectMany(
        fileName => File.ReadAllLines(fileName),
        (fileName, line) => new { fileName, line })
    .SelectMany(
        anon1 => searchStrings,
        (anon1, searchString) => new { anon1, searchString })
    .Where(anon2 => anon2.anon1.line.Contains(anon2.searchString))
    .Select(anon2 => new
    {
        FileName = anon2.anon1.fileName,
        SearchString = anon2.searchString
    });
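
If the trailing foreach and the mutable MultiMap still feel too imperative, one possible variation is to finish the query with Distinct and ToLookup. This is only a sketch: it swaps the MultiMap for the built-in ILookup, uses made-up in-memory data in place of GetFiles() and File.ReadAllLines, and relies on top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative in-memory stand-ins for the files and their lines.
var fileContents = new[]
{
    (FileName: "a.cs", Lines: new[] { "int foo = 1;", "foo += bar;" }),
    (FileName: "b.cs", Lines: new[] { "return bar;" })
};
var searchStrings = new[] { "foo", "bar" };

// Same query shape as above, then de-duplicate (the AddIfNew part)
// and group the file names by search string.
ILookup<string, string> lookup =
    (from file in fileContents
     from line in file.Lines
     from searchString in searchStrings
     where line.Contains(searchString)
     select (SearchString: searchString, file.FileName))
    .Distinct()
    .ToLookup(m => m.SearchString, m => m.FileName);

foreach (var entry in lookup)
{
    Console.WriteLine(entry.Key + ": " + string.Join(", ", entry));
}
```

Like the MultiMap, an ILookup returns an empty sequence for a key it does not contain, so lookups never hit a null list. For large files, File.ReadLines could be substituted for File.ReadAllLines to stream lines lazily instead of loading each file fully.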
