How to read/load YAML parameters with leading zeros as a string?


Question

How can I read/load YAML parameters with leading zeros as strings and manipulate them in Python 3.7? From a C++ tool using yaml-cpp (YAML 1.2), I get a text file containing leading_zero: 00005. When I read/load this line, the value seems to be converted into an int, but why? Do you know how to handle YAML strings with leading zeros?

import sys
from ruamel.yaml import YAML

yaml = YAML()
inp = "leading_zero: 00005\n"
code = yaml.load(inp)
print(code)
print(code['leading_zero'])
yaml.dump(code, sys.stdout)

Output with ruamel.yaml:

ordereddict([('leading_zero', 5)])
5
leading_zero: 00005

As you can see, 00005 is not stored as the string '00005' in the ordereddict, so why does yaml.dump() show the correct number then?

import yaml
inp = "leading_zero: 00005\n"
code = yaml.safe_load(inp)  # PyYAML 6+ requires an explicit Loader for yaml.load()
print(code)
print(yaml.dump(code, default_flow_style=False))

Output with PyYAML:

{'leading_zero': 5}
leading_zero: 5

Recommended answer

First of all, there are no YAML strings: there are collections (mappings and sequences) and scalars. Assuming these scalars are not tagged (as in your case), they can be either quoted (for the sake of simplicity this includes literal/folded style) or plain.

In the normal case of loading a YAML document, a quoted scalar is loaded as a string, while a plain scalar is open to interpretation as a special type depending on its "content". That interpretation could lead to it being a boolean, a date, a floating-point value or, as here, an integer. If none of those match, the plain scalar is loaded as a string.

The normal loading case applies the Core Schema. That schema is a superset of the JSON schema, and in both, plain scalars consisting only of digits are supposed to be loaded as integers. So this answers your first question, on how to handle "YAML strings".

ruamel.yaml, in its default (round-trip) mode, tries to preserve the specific format of your YAML document when you load and then dump it (this is not always possible, but it tries). Although it loads 00005 as an integer, the value is actually an instance of a subclass of int that carries information about the format of the integer (i.e. including the number of leading zeros). If your YAML document is under revision control, it is nice that this kind of thing doesn't change just because you updated some other part of the document.

This should answer your second question, about why ruamel.yaml shows the correct scalar on output.

PyYAML doesn't do this (and neither does ruamel.yaml if you decide to use safe loading). You were also lucky to pick a scalar like 00005 for your test: in PyYAML, 00008 loads as a string (PyYAML follows the pre-2009 YAML 1.1 specification, in which a leading zero indicates octal and 8 is not an octal digit; in YAML 1.2 octals start with 0o), and 00015 loads in ruamel.yaml as the number 15 but in PyYAML as the number 13:

import sys
import ruamel.yaml
import yaml as pyyaml

yaml_str = """\
- 00005    
- 00008    # this is not an octal in YAML 1.1
- 00015
"""

yaml = ruamel.yaml.YAML()
data = yaml.load(yaml_str)
print('ruamel.yaml:', data, type(data[0]))
yaml.dump(data, sys.stdout)
print('-----------')
data = pyyaml.safe_load(yaml_str)  # yaml.load() without a Loader fails on PyYAML 6+
print('pyyaml:     ', data, type(data[0]))
pyyaml.dump(data, sys.stdout, default_flow_style=False)

which gives:

ruamel.yaml: [5, 8, 15] <class 'ruamel.yaml.scalarint.ScalarInt'>
- 00005
- 00008    # this is not an octal in YAML 1.1
- 00015
-----------
pyyaml:      [5, '00008', 13] <class 'int'>
- 5
- 00008
- 13
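The different results for 00008 and 00015 follow from how each YAML version recognises integers. The sketch below is a deliberately simplified illustration using only the standard library, not the actual resolver code of either library. The function names `resolve_yaml11` and `resolve_yaml12` are made up for this example, and the patterns ignore hex, binary, underscores, sexagesimal values and floats.

```python
import re

# Simplified integer patterns (assumption: decimal and octal only).
YAML11_OCTAL = re.compile(r'^[-+]?0[0-7]+$')               # YAML 1.1: a leading 0 means octal
YAML11_DECIMAL = re.compile(r'^[-+]?(?:0|[1-9][0-9]*)$')   # YAML 1.1: no leading zeros in decimals
YAML12_OCTAL = re.compile(r'^0o[0-7]+$')                   # YAML 1.2: octal needs the 0o prefix
YAML12_DECIMAL = re.compile(r'^[-+]?[0-9]+$')              # YAML 1.2: any run of digits is decimal


def resolve_yaml11(scalar):
    """Interpret a plain scalar roughly the way a YAML 1.1 loader would."""
    if YAML11_OCTAL.match(scalar):
        return int(scalar, 8)
    if YAML11_DECIMAL.match(scalar):
        return int(scalar)
    return scalar  # no integer pattern matched: stays a string


def resolve_yaml12(scalar):
    """Interpret a plain scalar roughly the way a YAML 1.2 core schema loader would."""
    if YAML12_OCTAL.match(scalar):
        return int(scalar[2:], 8)
    if YAML12_DECIMAL.match(scalar):
        return int(scalar, 10)
    return scalar


for s in ['00005', '00008', '00015']:
    print(f'{s}  1.1: {resolve_yaml11(s)!r:8}  1.2: {resolve_yaml12(s)!r}')
```

Under these simplified rules, 00008 matches neither YAML 1.1 pattern (8 is not an octal digit, and a decimal may not start with 0), so it falls through as a string, while 00015 is read as octal 15, i.e. 13.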

Do I know how to handle "YAML strings" with leading zeros? No, I don't, but I can give you several options to choose from, depending on the purpose for which you load your document (just to be clear: I am the author of ruamel.yaml).

  • In the default round-trip mode, I load them as values that behave like integers but preserve their external, input-specific appearance, because YAML offers an endless multitude of ways to present any particular number (like 5).

  • If you load after doing yaml = YAML(typ='safe'), you will just have plain integers, which don't dump with leading zeros.

  • If you load after doing yaml = YAML(typ='base'), you will get the BaseLoader and every scalar loads as a string.

As a program:

from ruamel.yaml import YAML

for t in ['rt', 'safe', 'base']:   # 'rt' is the default
    data = YAML(typ=t).load("00005")
    dt = type(data)
    print(f'{t:5}  {data!r:7}  {dt}')

which gives:

rt     5        <class 'ruamel.yaml.scalarint.ScalarInt'>
safe   5        <class 'int'>
base   '00005'  <class 'str'>

So if you don't like the magic "integers" of the round-trip mode, load with the base schema and process the strings loaded from the YAML scalars yourself. An alternative would be to remove the integer-matching regular expression from the safe or round-trip resolver, but that is more complex.
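Processing the strings yourself could look like the following sketch. It assumes the document was already loaded with typ='base', so every scalar arrives as a str; the `loaded` dictionary below is hand-written sample data standing in for that result, and `convert` is a hypothetical helper, not part of ruamel.yaml. The policy shown (convert only integers without leading zeros, keep everything else as a string) is one possible choice among many.

```python
import re

# Integers written without leading zeros; anything else is left as a string.
PLAIN_INT = re.compile(r'^[-+]?(?:0|[1-9][0-9]*)$')


def convert(node):
    """Recursively convert base-loaded strings, keeping zero-padded values as str."""
    if isinstance(node, dict):
        return {key: convert(value) for key, value in node.items()}
    if isinstance(node, list):
        return [convert(item) for item in node]
    if isinstance(node, str) and PLAIN_INT.match(node):
        return int(node)
    return node


# Sample data shaped like what a base load would return (all scalars are str).
loaded = {
    'leading_zero': '00005',
    'count': '15',
    'name': 'abc',
    'items': ['007', '0', '42'],
}

result = convert(loaded)
print(result)
```

Note that a base load also turns quoted scalars into plain strings, so a conversion like this cannot tell whether "15" was quoted in the source; if that distinction matters, this approach is not sufficient.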
