Parsing YAML, get line numbers even in ordered maps


Problem description

I need to get the line numbers of certain keys in a YAML file.

Please note that this answer does not solve the issue: I do use ruamel.yaml, and the answers there do not work with ordered maps.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from ruamel import yaml

data = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")

print(data)

As a result I get:

CommentedMap([('key1', CommentedOrderedMap([('key2', 'item2'), ('key3', 'item3'), ('key4', CommentedOrderedMap([('key5', 'item5'), ('key6', 'item6')]))]))])

which does not allow access to the line numbers, except for the !!omap keys:

print(data['key1'].lc.line)  # output: 1
print(data['key1']['key4'].lc.line)  # output: 4

But:

print(data['key1']['key2'].lc.line)  # output: AttributeError: 'str' object has no attribute 'lc'

Indeed, data['key1']['key2'] is a str.

I've found a workaround:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from ruamel import yaml

DATA = yaml.round_trip_load("""
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: item5
    - key6: item6
""")


def get_line_nb(data):
    if isinstance(data, dict):
        offset = data.lc.line
        for i, key in enumerate(data):
            if isinstance(data[key], dict):
                get_line_nb(data[key])
            else:
                print('{}|{} found in line {}\n'
                      .format(key, data[key], offset + i + 1))


get_line_nb(DATA)

Output:

key2|item2 found in line 2

key3|item3 found in line 3

key5|item5 found in line 5

key6|item6 found in line 6

But this looks a little bit "dirty". Is there a more proper way of doing it?

This workaround is not only dirty, it also only works for simple cases like the one above, and it will give wrong results as soon as there are nested lists in the way.

Recommended answer

The issue is not that you are using !omap and that it fails to give you the line numbers the way "normal" mappings do. That should be clear from the fact that you get 4 from doing print(data['key1']['key4'].lc.line) (where key4 is a key in the outer !omap).

As that answer indicates,

you can access the property lc on collection items

The value for data['key1']['key4'] is a collection item (another !omap), but the value for data['key1']['key2'] is not a collection item. It is a built-in Python string, which has no slot to store the lc attribute.
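The difference can be reproduced without any YAML library. The following is a minimal sketch in plain Python; `LineCol` here is a hypothetical stand-in for ruamel.yaml's own class of the same name, and `StrWithLC` is an illustrative name that does not exist in the library. It shows that a built-in str rejects new attributes, while a str subclass that declares a slot for `lc` can carry one.

```python
class LineCol:
    """Hypothetical stand-in holding a zero-based line and column."""

    def __init__(self, line, col):
        self.line = line
        self.col = col


class StrWithLC(str):
    # Declaring a slot gives every instance room for an `lc` attribute.
    __slots__ = ('lc',)


plain = 'item2'
plain_failed = False
try:
    # Built-in str instances have no __dict__ and no `lc` slot.
    plain.lc = LineCol(2, 10)
except AttributeError as exc:
    plain_failed = True
    print('plain str:', exc)

tagged = StrWithLC('item2')
tagged.lc = LineCol(2, 10)  # illustrative position, not read from a parser
print(tagged, tagged.lc.line, tagged.lc.col)
```

The subclass still compares equal to the plain string and passes isinstance(..., str) checks, so code that consumes the loaded data usually does not need to change.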

To get an .lc attribute on a non-collection like a string, you have to subclass RoundTripConstructor so that it uses something like the classes in scalarstring.py (with __slots__ adjusted to accept the lc attribute), and then transfer the line and column information available in the nodes to that attribute:

import ruamel.yaml

yaml_str = """
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: 'item5'
    - key6: |
        item6
"""

# Plain (unquoted) scalars: a ScalarString subclass with room for `lc`.
class Str(ruamel.yaml.scalarstring.ScalarString):
    __slots__ = ('lc',)

    style = ""

    def __new__(cls, value):
        return ruamel.yaml.scalarstring.ScalarString.__new__(cls, value)

# The styled scalar string classes, each extended with an `lc` slot.
class MyPreservedScalarString(ruamel.yaml.scalarstring.PreservedScalarString):
    __slots__ = ('lc',)

class MyDoubleQuotedScalarString(ruamel.yaml.scalarstring.DoubleQuotedScalarString):
    __slots__ = ('lc',)

class MySingleQuotedScalarString(ruamel.yaml.scalarstring.SingleQuotedScalarString):
    __slots__ = ('lc',)

class MyConstructor(ruamel.yaml.constructor.RoundTripConstructor):
    def construct_scalar(self, node):
        # type: (Any) -> Any
        if not isinstance(node, ruamel.yaml.nodes.ScalarNode):
            raise ruamel.yaml.constructor.ConstructorError(
                None, None,
                "expected a scalar node, but found %s" % node.id,
                node.start_mark)

        if node.style == '|' and isinstance(node.value, str):
            ret_val = MyPreservedScalarString(node.value)
        elif bool(self._preserve_quotes) and isinstance(node.value, str):
            if node.style == "'":
                ret_val = MySingleQuotedScalarString(node.value)
            elif node.style == '"':
                ret_val = MyDoubleQuotedScalarString(node.value)
            else:
                ret_val = Str(node.value)
        else:
            ret_val = Str(node.value)
        ret_val.lc = ruamel.yaml.comments.LineCol()
        ret_val.lc.line = node.start_mark.line
        ret_val.lc.col = node.start_mark.column
        return ret_val


yaml = ruamel.yaml.YAML()
yaml.Constructor = MyConstructor

data = yaml.load(yaml_str)
print(data['key1']['key4'].lc.line)
print(data['key1']['key2'].lc.line)
print(data['key1']['key4']['key6'].lc.line)

Please note that the output of the last call to print is 6, as the literal scalar string starts with the |.
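To see why 6 is the expected value, it helps to list the input lines with zero-based indices, which is how the start marks count. The following is a small sketch in plain Python that needs no YAML library; the helper `to_editor_line` is a hypothetical name introduced only for illustration.

```python
yaml_str = """
key1: !!omap
  - key2: item2
  - key3: item3
  - key4: !!omap
    - key5: 'item5'
    - key6: |
        item6
"""

lines = yaml_str.splitlines()

# The string starts with a newline, so index 0 is an empty line.
for index, text in enumerate(lines):
    print(index, repr(text))


def to_editor_line(zero_based_line):
    """Convert a zero-based line index to the 1-based number editors show."""
    return zero_based_line + 1


# The `|` indicator sits at index 6; the text `item6` is one line further.
print(to_editor_line(6))
```

If you report positions to users, adding 1 as above gives the number most editors display, counting the leading empty line of the string.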

If you also want to dump data, you'll need to make a Representer aware of those My.... types.
