String separation C#
Problem description
I am reading a txt file like the one below. I am trying to separate this data into a few different columns.
Command sent from hmi(0).ctq[0] to calh(1).ctq[0] v:1,
Command sent from ptov(21) to bo(1).ctq[10] v:0,
Command answer from bo(1) to ptov(21) code:15 - complete,
Event ptof(1).sgn[7] v:0 s:0601,
Command sent from ptuf(1) to bo(1).ctq[5] v:0,
I am able to separate the lines starting with "Event". I do it this way. It is easy because there is a whitespace character after each important part.
List<string> description = list.Select(x => x.System_Description).ToList<string>();
DataTable dt = new DataTable();
dt.Columns.Add("values");
foreach (string items in description)
{
    if (items[0] == 'E')
    {
        // "Event" lines: every important part is separated by a space
        string[] _columns = items.Split(" ".ToCharArray());
    }
    else
    {
        // "Command" lines: not handled yet
    }
    DataRow row = dt.NewRow();
    dt.Rows.Add(items);
}
For the lines starting with "Command", I would like to separate them into 4 columns. The first one will be just "Command"; in the second one I want to put everything between "from" and "to". The third one will be the data after "to", and the last one will be the value with "v:..". Can you help me somehow, or suggest how I can do it?
I would suggest using a regular expression to parse the lines. Here is some working code:
var text = @"Command sent from hmi(0).ctq[0] to calh(1).ctq[0] v:1,
Command sent from ptov(21) to bo(1).ctq[10] v:0,
Command answer from bo(1) to ptov(21) code:15 - complete,
Event ptof(1).sgn[7] v:0 s:0601,
Command sent from ptuf(1) to bo(1).ctq[5] v:0,";
var lines = text.Split(
    Environment.NewLine.ToCharArray(),
    StringSplitOptions.RemoveEmptyEntries
);
var regex = new Regex(@"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");
var result = lines
    .Select(line => regex.Match(line))
    .Select(
        match => new {
            C0 = match.Groups["C0"].Value,
            C1 = match.Groups["C1"].Value,
            C2 = match.Groups["C2"].Value,
            C3 = match.Groups["C3"].Value
        }
    );
The result is:
C0      | C1             | C2             | C3                 |
--------+----------------+----------------+--------------------+
Command | hmi(0).ctq[0]  | calh(1).ctq[0] | v:1                |
Command | ptov(21)       | bo(1).ctq[10]  | v:0                |
Command | bo(1)          | ptov(21)       | code:15 - complete |
Event   | ptof(1).sgn[7] | v:0            | s:0601             |
Command | ptuf(1)        | bo(1).ctq[5]   | v:0                |
You did not specify how to parse the Command answer from line, so I took the liberty of making some decisions about this myself. Also, I have just created a LINQ query that parses the lines into a sequence of anonymous objects. See below for how to put the results into a DataTable (slightly noisier code).
Here are some highlights of the regular expression:
(?<C0>Event) is a named group that matches Event. The name is C0 (column zero), and the matched value of the group is accessible in the Match object after a match has been performed.
(?:answer|sent) is a non-capturing group that matches either answer or sent, but what it matches is not captured. The bulk of the regular expression is also made up of a non-capturing group that matches either the Command line or the Event line.
\S+ matches one or more non-whitespace characters.
Starting the regular expression with ^ and ending it with $ ensures that the entire line is matched.
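To see these pieces at work on individual lines, here is a minimal sketch that applies the same pattern to three inputs. ParseLine is a hypothetical helper introduced only for this illustration, and returning an empty array for a non-matching line is an assumption, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as above, split over two string literals for readability.
var regex = new Regex(
    @"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)" +
    @"|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");

// Hypothetical helper: returns the four columns, or an empty array if the line does not match.
string[] ParseLine(string line)
{
    var match = regex.Match(line);
    if (!match.Success)
        return Array.Empty<string>();
    return new[]
    {
        match.Groups["C0"].Value,
        match.Groups["C1"].Value,
        match.Groups["C2"].Value,
        match.Groups["C3"].Value
    };
}

var command = ParseLine("Command answer from bo(1) to ptov(21) code:15 - complete,");
var evt = ParseLine("Event ptof(1).sgn[7] v:0 s:0601,");
// No trailing comma, so the ",$" at the end of the pattern cannot match.
var noComma = ParseLine("Command sent from ptuf(1) to bo(1).ctq[5] v:0");

Console.WriteLine(string.Join(" | ", command));
Console.WriteLine(string.Join(" | ", evt));
Console.WriteLine(noComma.Length == 0 ? "no match" : "match");
```

The third input shows the effect of the anchors: because the pattern requires a comma right before the end of the line, a line without it is rejected rather than partially matched.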
To put the results in a DataTable, you can drop the anonymous type and use this code instead (it replaces the code starting at the var result = lines line):
var matches = lines.Select(line => regex.Match(line));
var dataTable = new DataTable();
foreach (var columnName in new[] { "A", "B", "C", "D" })
    dataTable.Columns.Add(columnName);
foreach (var match in matches)
    dataTable.Rows.Add(
        match.Groups.Cast<Group>().Skip(1).Select(group => group.Value).ToArray()
    );
The only tricky part is the Skip(1), which skips the first group in the match. The first group is the entire match. By skipping it, I know that the four remaining groups are C0 to C3, and their values are then used to create the array with the column values for the row.
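Since I don't look the groups up by name here, it is tempting to remove the names from the regular expression, e.g. to replace (?<C1>\S+) with (\S+). Note, however, that the two alternatives share the names C0 to C3. With unnamed groups, each pair of parentheses gets its own number, so the match would expose eight groups instead of four and the Skip(1) code would no longer line up with the four columns.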
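Before dropping the names, it may be worth checking how many groups each variant exposes, because Skip(1) passes every remaining group on to the row. The sketch below compares the named pattern with a version where every name is removed; the unnamed pattern is written here only for the comparison and is not from the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var line = "Event ptof(1).sgn[7] v:0 s:0601,";

// Named version: both alternatives reuse the names C0..C3, so they share four groups.
var named = new Regex(
    @"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)" +
    @"|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");

// Same pattern with every name dropped: each pair of parentheses is numbered separately.
var unnamed = new Regex(
    @"^(?:(Event) (\S+) (\S+) (\S+)" +
    @"|(Command) (?:answer|sent) from (\S+) to (\S+) (.+)),$");

int namedCount = named.Match(line).Groups.Count;     // group 0 plus C0..C3
int unnamedCount = unnamed.Match(line).Groups.Count; // group 0 plus eight numbered groups

// What Skip(1) would hand to DataTable.Rows.Add in the unnamed case.
var unnamedValues = unnamed.Match(line).Groups.Cast<Group>()
    .Skip(1).Select(g => g.Value).ToArray();

Console.WriteLine($"named: {namedCount}, unnamed: {unnamedCount}");
Console.WriteLine(string.Join("|", unnamedValues));
```

With the names in place, the four columns come out the same way regardless of which alternative matched. Without them, the values of the alternative that did not match show up as extra empty entries.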
I just picked A, B, C and D as arbitrary names for the columns.
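The DataTable loop above adds a row for every line, including any line the pattern does not match, which would yield a row without useful values. A minimal sketch of one way to guard against that is to check Match.Success before adding the row. The third input line and the skipped counter are made up for this illustration.

```csharp
using System;
using System.Data;
using System.Linq;
using System.Text.RegularExpressions;

var lines = new[]
{
    "Command sent from hmi(0).ctq[0] to calh(1).ctq[0] v:1,",
    "Event ptof(1).sgn[7] v:0 s:0601,",
    "this line has an unexpected format"   // made-up input that the pattern rejects
};

var regex = new Regex(
    @"^(?:(?<C0>Event) (?<C1>\S+) (?<C2>\S+) (?<C3>\S+)" +
    @"|(?<C0>Command) (?:answer|sent) from (?<C1>\S+) to (?<C2>\S+) (?<C3>.+)),$");

var dataTable = new DataTable();
foreach (var columnName in new[] { "A", "B", "C", "D" })
    dataTable.Columns.Add(columnName);

var skipped = 0;
foreach (var line in lines)
{
    var match = regex.Match(line);
    if (!match.Success)
    {
        skipped++;   // a failed match carries no captured values, so no row is added
        continue;
    }
    // Group 0 is the whole match; the remaining four groups are C0..C3.
    dataTable.Rows.Add(
        match.Groups.Cast<Group>().Skip(1).Select(g => (object)g.Value).ToArray());
}

Console.WriteLine($"{dataTable.Rows.Count} rows added, {skipped} skipped");
```

Counting the skipped lines instead of silently dropping them makes it easier to notice when the input contains a line format the pattern was not written for.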