Deserialize a YAML "Table" of data


Problem description

I am using YamlDotNet and C# to deserialize a file created by a third-party software application. Both of the following YAML examples are valid output from the application:

#File1
Groups:
  - Name: ATeam
    FirstName, LastName, Age, Height:
      - [Joe, Soap, 21, 184]
      - [Mary, Ryan, 20, 169]
      - [Alex, Dole, 24, 174]

#File2
Groups:
  - Name: ATeam
    FirstName, LastName, Height:
      - [Joe, Soap, 184]
      - [Mary, Ryan, 169]
      - [Alex, Dole, 174]

Notice that File2 doesn't have an Age column, but the deserializer must still recognise that the third value on each line is a height rather than an age. This data is supposed to represent a table of people. In File1, for example, Mary Ryan is 20 years old and 169cm tall. The deserializer needs to understand which columns it has (File2 only has FirstName, LastName and Height) and store the data in the right objects accordingly: Mary Ryan is 169cm tall.

Similarly, the program documentation states that the order of the columns is not important, so File3 below is an equally valid way to represent the data in File2, even though Height now comes first:

#File3
Groups:
  - Name: ATeam
    Height, FirstName, LastName:
      - [184, Joe, Soap]
      - [169, Mary, Ryan]
      - [174, Alex, Dole]

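I have a number of questions:

  1. Is this standard YAML? I could not find anything about using multiple keys on the same line, followed by a colon and lists of values, to represent tables of data.
  2. How would I use YamlDotNet to deserialize this? Is there a modification I can make to help it?
  3. If I can't use YamlDotNet, how should I go about it?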

Recommended answer

As other answers stated, this is valid YAML. However, the structure of the document is specific to the application and does not use any special feature of YAML to express tables. To YAML, a key such as "FirstName, LastName, Height" is a single scalar (one string), and its value is an ordinary list of lists.

You can easily parse this document using YamlDotNet. However, you will run into two difficulties. The first is that, since the column names are placed inside the key, you will need some custom serialization code to handle them. The second is that you will need to implement some kind of abstraction to access the data in a tabular way.
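As a rough illustration of both points, independent of YamlDotNet, the sketch below splits the combined key into column names and pairs each row with them. The `TableRows` class and its `SplitColumns` and `ToRecords` methods are hypothetical names introduced only for this example; cell values are left as `object`, whatever the deserializer produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of YamlDotNet: turns the combined key and
// the row lists into one dictionary per row, keyed by column name.
public static class TableRows
{
    // "FirstName, LastName, Height" -> ["FirstName", "LastName", "Height"]
    public static string[] SplitColumns(string key)
    {
        return key.Split(',').Select(n => n.Trim()).ToArray();
    }

    public static List<Dictionary<string, object>> ToRecords(
        string key, IEnumerable<IList<object>> rows)
    {
        var columns = SplitColumns(key);
        var records = new List<Dictionary<string, object>>();

        foreach (var row in rows)
        {
            // A row that does not match the header cannot be mapped safely.
            if (row.Count != columns.Length)
            {
                throw new FormatException(
                    "Row has " + row.Count + " values but the key names "
                    + columns.Length + " columns.");
            }

            var record = new Dictionary<string, object>();
            for (var i = 0; i < columns.Length; i++)
            {
                record[columns[i]] = row[i];
            }
            records.Add(record);
        }

        return records;
    }
}
```

Because every value is looked up by column name, `record["Height"]` refers to the same cell whether the key reads "FirstName, LastName, Height" (File2) or "Height, FirstName, LastName" (File3).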

I have put together a proof of concept that illustrates how to parse and read the data.

First, create a type to hold the information from the YAML document:

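// Namespaces used by the snippets in this answer.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using YamlDotNet.Core;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization;

public class Document
{
    public List<Group> Groups { get; set; }
}

public class Group
{
    public string Name { get; set; }

    // Column names taken from the combined key, e.g. "FirstName, LastName, Height".
    public IEnumerable<string> ColumnNames { get; set; }

    // One inner list per table row, in the same order as ColumnNames.
    public IList<IList<object>> Rows { get; set; }
}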

Then implement an IYamlTypeConverter to parse the Group type:

public class GroupYamlConverter : IYamlTypeConverter
{
    private readonly Deserializer deserializer;

    public GroupYamlConverter(Deserializer deserializer)
    {
        this.deserializer = deserializer;
    }

    public bool Accepts(Type type)
    {
        return type == typeof(Group);
    }

    public object ReadYaml(IParser parser, Type type)
    {
        var group = new Group();

        var reader = new EventReader(parser);
        do
        {
            var key = reader.Expect<Scalar>();
            if (key.Value == "Name")
            {
                group.Name = reader.Expect<Scalar>().Value;
            }
            else
            {
                // Any other key is the comma-separated list of column names.
                group.ColumnNames = key.Value
                    .Split(',')
                    .Select(n => n.Trim())
                    .ToArray();

                // The value under that key is the list of rows.
                group.Rows = deserializer.Deserialize<IList<IList<object>>>(reader);
            }
        } while (!reader.Accept<MappingEnd>());
        reader.Expect<MappingEnd>();

        return group;
    }

    public void WriteYaml(IEmitter emitter, object value, Type type)
    {
        throw new NotImplementedException("TODO");
    }
}

Finally, register the converter with the deserializer and deserialize the document:

var deserializer = new Deserializer();
deserializer.RegisterTypeConverter(new GroupYamlConverter(deserializer));

var document = deserializer.Deserialize<Document>(new StringReader(yaml));

You can test a fully working example here.

This is only a proof of concept, but it should serve as a guideline for your own implementation. Things that could be improved include:

  • Checking for and handling invalid documents.
  • Improving the Group class. Maybe make it immutable, and also add an indexer.
  • Implementing the WriteYaml method if serialization support is desired.
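The second suggestion could take roughly the following shape. This is only a sketch under assumptions: `TableGroup` and all of its members are hypothetical names, not part of the proof of concept or of YamlDotNet, and it assumes every row has as many values as there are columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical read-only variant of Group; the name TableGroup and its
// members are illustrative only.
public sealed class TableGroup
{
    private readonly Dictionary<string, int> columnIndexes;
    private readonly IReadOnlyList<IReadOnlyList<object>> rows;

    public TableGroup(
        string name,
        IEnumerable<string> columnNames,
        IEnumerable<IEnumerable<object>> rows)
    {
        Name = name;
        ColumnNames = columnNames.ToArray();

        // Copy the rows so later changes to the caller's lists are not visible here.
        this.rows = rows.Select(r => (IReadOnlyList<object>)r.ToArray()).ToArray();

        // Map each column name to its position; Add throws on duplicate names.
        columnIndexes = new Dictionary<string, int>();
        for (var i = 0; i < ColumnNames.Count; i++)
        {
            columnIndexes.Add(ColumnNames[i], i);
        }
    }

    public string Name { get; }

    public IReadOnlyList<string> ColumnNames { get; }

    public int RowCount
    {
        get { return rows.Count; }
    }

    // group[1, "Height"] returns the Height cell of the second row.
    public object this[int row, string column]
    {
        get { return rows[row][columnIndexes[column]]; }
    }
}
```

An instance could be built from a deserialized group with `new TableGroup(group.Name, group.ColumnNames, group.Rows)`, after which cells are read by row number and column name regardless of the column order in the file.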
