String separation C#


Problem description


I am reading a txt file like the one below. I am trying to separate this data into a few different columns.

Command sent from hmi(0).ctq[0] to calh(1).ctq[0] v:1,
Command sent from ptov(21) to bo(1).ctq[10] v:0,
Command answer from bo(1) to ptov(21) code:15 - complete,
Event ptof(1).sgn[7] v:0 s:0601,
Command sent from ptuf(1) to bo(1).ctq[5] v:0,

I am able to separate the lines starting with "Event". I do it in the following way. It is easy because there is a whitespace character after each important part.

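List<string> description = list.Select(x => x.System_Description).ToList();
DataTable dt = new DataTable();
dt.Columns.Add("values");

foreach (string items in description)
{
    if (items[0] == 'E')
    {
        // "Event" lines: every part is separated by a single space.
        string[] _columns = items.Split(" ".ToCharArray());
    }
    else
    {
        // "Command" lines: this is the part I do not know how to split.
    }
    // For now the whole line goes into the single "values" column.
    dt.Rows.Add(items);
}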

For the lines starting with "Command", I would like to split each one into 4 columns. The first will be just "Command"; the second should hold everything between "from" and "to". The third will be the data after "to", and the last will be the value with "v:..". Can you help me, or suggest how I can do it?

Solution

I would suggest using a regular expression to parse the lines. Here is some working code:

using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = @"Command sent from hmi(0).ctq[0] to calh(1).ctq[0] v:1,
Command sent from ptov(21) to bo(1).ctq[10] v:0,
Command answer from bo(1) to ptov(21) code:15 - complete,
Event ptof(1).sgn[7] v:0 s:0601,
Command sent from ptuf(1) to bo(1).ctq[5] v:0,";

var lines = text.Split(
  Environment.NewLine.ToCharArray(),
  StringSplitOptions.RemoveEmptyEntries
);
var regex = new Regex(@"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");
var result = lines
  .Select(line => regex.Match(line))
  .Select(
    match => new {
      C0 = match.Groups["C0"].Value,
      C1 = match.Groups["C1"].Value,
      C2 = match.Groups["C2"].Value,
      C3 = match.Groups["C3"].Value
    }
  );

The result is:

C0      | C1             | C2             | C3                 |
--------+----------------+----------------+--------------------+
Command | hmi(0).ctq[0]  | calh(1).ctq[0] | v:1                |
Command | ptov(21)       | bo(1).ctq[10]  | v:0                |
Command | bo(1)          | ptov(21)       | code:15 - complete |
Event   | ptof(1).sgn[7] | v:0            | s:0601             |
Command | ptuf(1)        | bo(1).ctq[5]   | v:0                |

You did not specify how to parse the Command answer from line, so I took the liberty of making some decisions about this myself. Also, I have just created a LINQ query that will parse the lines into a sequence of anonymous objects. See below for how to put the results into a DataTable (slightly noisier code).

Here are some highlights of the regular expression:

  1. (?<C0>Event) is a named group that matches Event. The name is C0 (column zero) and the matched value of the group is accessible in the Match object after a match has been performed.

  2. (?:answer|sent) is a non-capturing group that will match either answer or sent, but what it matches is not captured. The bulk of the regular expression is also made up of a non-capturing group that will match either the Command line or the Event line.

  3. \S+ matches one or more non-whitespace characters.

  4. Starting the regular expression with ^ and ending it with $ ensures that the entire line is matched.
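The points above can be checked on individual lines. The following is a minimal sketch, assuming .NET 6 or later with top-level statements; the helper name ParseLine is hypothetical and only there for illustration. It returns the four named groups for a matching line and an empty array when the line does not match, for example when the trailing comma required by ,$ is missing.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as in the answer: two alternatives that share the group names C0..C3.
var regex = new Regex(
    @"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");

// Hypothetical helper: the four column values, or an empty array if the line does not match.
string[] ParseLine(string line)
{
    var match = regex.Match(line);
    if (!match.Success)
        return Array.Empty<string>();
    return new[]
    {
        match.Groups["C0"].Value,
        match.Groups["C1"].Value,
        match.Groups["C2"].Value,
        match.Groups["C3"].Value
    };
}

var answer = ParseLine("Command answer from bo(1) to ptov(21) code:15 - complete,");
var evt = ParseLine("Event ptof(1).sgn[7] v:0 s:0601,");
// No trailing comma, so the ,$ at the end of the pattern cannot match.
var noComma = ParseLine("Event ptof(1).sgn[7] v:0 s:0601");

Console.WriteLine(string.Join(" | ", answer));
Console.WriteLine(string.Join(" | ", evt));
Console.WriteLine($"columns without trailing comma: {noComma.Length}");
```

Because (?<C3>.+) is used in the Command alternative, the last column can contain spaces, which is why "code:15 - complete" ends up in a single column.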

To put the results in a DataTable you can drop the anonymous type and instead use this code (it replaces everything from the var result = lines line of code onward):

// DataTable lives in System.Data, so add: using System.Data;
var matches = lines.Select(line => regex.Match(line));
var dataTable = new DataTable();
foreach (var columnName in new[] { "A", "B", "C", "D" })
  dataTable.Columns.Add(columnName);
foreach (var match in matches)
  dataTable.Rows.Add(
    match.Groups.Cast<Group>().Skip(1).Select(group => group.Value).ToArray()
  );

The only tricky part is the Skip(1) where the first group in the match is skipped. The first group is the entire match. By skipping that I know that the four remaining groups are C0 to C3 and the values are then used to create the array with the column values for the row.
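To make the Skip(1) step concrete, here is a small sketch (again assuming .NET 6 or later with top-level statements) that looks at one match. It shows that Groups[0] holds the entire match, including the trailing comma, while the remaining groups hold the four column values.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var regex = new Regex(
    @"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");

var match = regex.Match("Command sent from ptov(21) to bo(1).ctq[10] v:0,");

// Group 0 is always the whole match, so it is not a column value.
var whole = match.Groups[0].Value;

// Skipping group 0 leaves the groups C0..C3 in order.
var columns = match.Groups
    .Cast<Group>()
    .Skip(1)
    .Select(g => g.Value)
    .ToArray();

Console.WriteLine(whole);
Console.WriteLine(string.Join(" | ", columns));
```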

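This code does not look up the groups by name, but the names should still stay in the regular expression. Both alternatives reuse the names C0 to C3, so .NET merges them into four groups. Replacing (?<C1>\S+) with (\S+) etc. would give eight separately numbered groups, four of them empty for any given line, and the row array would no longer fit the four columns.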
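One caveat is worth checking before changing the group names. The sketch below (assuming .NET 6 or later with top-level statements; the variable names are illustrative) compares the number of groups produced by the named pattern with a variant where every name has been removed. In .NET, groups that share a name share one slot, while unnamed groups are each numbered separately.

```csharp
using System;
using System.Text.RegularExpressions;

const string line = "Event ptof(1).sgn[7] v:0 s:0601,";

// Pattern from the answer: the names C0..C3 appear in both alternatives.
var named = new Regex(
    @"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");

// Same pattern with the names removed: eight independent capturing groups.
var unnamed = new Regex(
    @"^(?:(Event) (\S+) (\S+) (\S+)|(Command) (?:answer|sent) from (\S+) to (\S+) (.+)),$");

// Groups.Count includes group 0 (the whole match).
var namedCount = named.Match(line).Groups.Count;
var unnamedCount = unnamed.Match(line).Groups.Count;

Console.WriteLine($"named: {namedCount}, unnamed: {unnamedCount}");
```

With the unnamed variant, Skip(1) would yield eight values for a table that has four columns, so keeping the shared names is the simpler choice here.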

I just picked A, B, C and D as random names for the columns.
