Parsing delimited data for specific instance of repeated line

Question

I have an array of strings in the following format, where each string begins with three characters indicating the type of data it contains. For example:

ABC|.....
DEF|...
RHG|1........
RHG|2........
RHG|3........
XDF|......

I want to find any repeated lines (RHG in this example) and mark the last one with a special character:

>RHG|3.........

What's the best way to do this? My current solution has a method that counts the line headers and returns a dictionary mapping each header to its count.

protected Dictionary<string, int> CountHeaders(string[] lines)
{
    Dictionary<string, int> headerCounts = new Dictionary<string, int>();
    for (int i = 0; i < lines.Length; i++)
    {
        string s = lines[i].Substring(0, 3);

        int value;
        if (headerCounts.TryGetValue(s, out value))
            headerCounts[s]++;
        else
            headerCounts.Add(s, 1);
    }
    return headerCounts;
}

In the main parsing method, I select the headers that occur more than once.

var repeats = CountHeaders(lines).Where(x => x.Value > 1).Select(x => x.Key);
foreach (string s in repeats)
{
    // Get last instance of line in lines and mark it
}

This is as far as I've gotten. I think I can do what I want with another LINQ query, but I'm not sure. I also can't help feeling that there's a better solution.

Recommended answer

You can use LINQ to achieve that.

Input string:

var input = @"ABC|.....
DEF|...
RHG|1........
RHG|2........
RHG|3........
XDF|......";

LINQ query:

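// Split with a string[] separator requires a StringSplitOptions argument.
// This assumes the line breaks inside the literal match Environment.NewLine.
var results = input.Split(new[] { Environment.NewLine }, StringSplitOptions.None)
                   .GroupBy(x => x.Substring(0, 3))
                   .Select(g => g.ToList())
                   .SelectMany(g => g.Count > 1 ? g.Take(g.Count - 1).Concat(new[] { string.Format(">{0}", g[g.Count - 1]) }) : g)
                   .ToArray();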

I used the Select(g => g.ToList()) projection so that g.Count is an O(1) operation in the later query steps.

You can join the result array into a single string with the String.Join method:

var output = String.Join(Environment.NewLine, results);
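
Note that GroupBy gathers all lines with the same header together, so if repeated lines are not adjacent in the input, the query above changes their order. A minimal sketch of an order-preserving alternative, closer to the counting approach in the question, could look like the following. The names RepeatMarker and MarkLastRepeats are hypothetical, and the sketch assumes every line is at least three characters long.

```csharp
using System;
using System.Collections.Generic;

public static class RepeatMarker
{
    // Hypothetical helper: returns a copy of lines in which the last occurrence
    // of every header that appears more than once is prefixed with '>'.
    public static string[] MarkLastRepeats(string[] lines)
    {
        var counts = new Dictionary<string, int>();
        var lastIndex = new Dictionary<string, int>();

        for (int i = 0; i < lines.Length; i++)
        {
            // Assumption: each line has at least three characters.
            string header = lines[i].Substring(0, 3);

            int count;
            counts.TryGetValue(header, out count); // count stays 0 if the header is new
            counts[header] = count + 1;
            lastIndex[header] = i;                 // overwritten until the last occurrence
        }

        // Copy the input so the caller's array is left untouched.
        var result = (string[])lines.Clone();
        foreach (var pair in counts)
        {
            if (pair.Value > 1)
            {
                int index = lastIndex[pair.Key];
                result[index] = ">" + result[index];
            }
        }
        return result;
    }
}
```

Calling RepeatMarker.MarkLastRepeats(lines) on the sample input leaves every line in place and only changes the final RHG line, so the result can be passed to String.Join in the same way as above.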
