How can I update a .yml file, ignoring preexisting Jinja syntax, using Python?

Question

I have some preprocessing to do on some existing .yml files. However, some of them have Jinja template syntax embedded in them:

A:
 B:
 - ip: 1.2.3.4
 - myArray:
   - {{ jinja.variable }}
   - val1
   - val2

I want to read in this file and add val3 under myArray, like this:

A:
 B:
 - ip: 1.2.3.4
 - myArray:
   - {{ jinja.variable }}
   - val1
   - val2
   - val 3

I tried manually writing out the Jinja templates, but they got written with single quotes around them: '{{ jinja.variable }}'

What is the recommended way to read such .yml files and modify them despite the preexisting Jinja syntax? I'd like to add information to these files while keeping everything else the same.

I tried the above using PyYAML on Python 2.7+.

Recommended answer

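The solution in this answer has been incorporated into ruamel.yaml using a plug-in mechanism. At the bottom of this post there are short instructions on how to use it.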

There are three aspects to updating a YAML file that contains jinja2 "code":

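  • making the jinja2 code acceptable to the YAML parser
  • making sure the changes that make it acceptable can be reversed (i.e. the substitutions should be unique, so that only they get reversed)
  • preserving the layout of the YAML file, so that the updated file, when processed by jinja2, still generates a valid YAML file that can be loaded again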
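
The second point is the subtle one. As a minimal sketch using only plain string replacement (the helper names naive_sanitize and naive_revert are made up for illustration and belong to no library), this shows why the substitute pattern must not already occur in the original text:

```python
def naive_sanitize(text):
    # hypothetical helper: always swaps '{{' for '<{' without inspecting the input
    return text.replace('{{', '<{')


def naive_revert(text):
    # hypothetical helper: blindly undoes the swap
    return text.replace('<{', '{{')


clean = "- {{ jinja.variable }}\n- val1\n"
clash = "- {{ jinja.variable }}\n- 'a <{ b'\n"  # '<{' already occurs in the input

# the clean input survives the round trip unchanged
print(naive_revert(naive_sanitize(clean)) == clean)
# the clashing input does not: the pre-existing '<{' is "reverted" to '{{' as well
print(naive_revert(naive_sanitize(clash)) == clash)
```

This is why a robust version has to pick a substitute that is guaranteed to be absent from the input and remember which one it picked.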

Let's start by making your example somewhat more realistic, by adding a jinja2 variable definition, a for-loop and some comments (input.yaml):

# trying to update
{% set xyz = "123" }

A:
  B:
  - ip: 1.2.3.4
  - myArray:
    - {{ jinja.variable }}
    - val1
    - val2         # add a value after this one
    {% for d in data %}
    - phone: {{ d.phone }}
      name: {{ d.name }}
    {% endfor %}
    - {{ xyz }}
# #% or ##% should not be in the file and neither <{ or <<{

The lines starting with {% contain no YAML, so we'll turn those into comments (assuming that comments are preserved on round-trip, see below). Since YAML scalars cannot start with { without being quoted, we'll change {{ to <{. In the following code this is done by calling sanitize(), which also stores the patterns it used; the reverse is done in sanitize.revert(), using the stored patterns.

The layout of your YAML code (block style etc.) is best preserved using ruamel.yaml (disclaimer: I am the author of that package). That way you don't have to worry about flow-style elements in the input getting mangled into block style, as happens with the rather crude default_flow_style=False that the other answers use. ruamel.yaml also preserves comments, both the ones that were originally in the file and those temporarily inserted to "comment out" jinja2 constructs starting with {%.

The resulting code:

import sys
from ruamel.yaml import YAML

yaml = YAML()


class Sanitize:
    """analyse, change and revert YAML/jinja2 mixture to/from valid YAML"""
    def __init__(self):
        self.accacc = None  # substitute chosen for '{{', e.g. '<{'
        self.accper = None  # substitute chosen for '{%', e.g. '#%'

    def __call__(self, s):
        # pick the shortest of '<{', '<<{', ... that does not occur in the input
        for n in range(1, 10):
            pat = '<' * n + '{'
            if pat not in s:
                self.accacc = pat
                break
        else:
            raise NotImplementedError('could not find substitute pattern '+pat)
        # likewise for '#%', '##%', ...; the leading '#' turns the line into a YAML comment
        for n in range(1, 10):
            pat = '#' * n + '%'
            if pat not in s:
                self.accper = pat
                break
        else:
            raise NotImplementedError('could not find substitute pattern '+pat)
        return s.replace('{{', self.accacc).replace('{%', self.accper)

    def revert(self, s):
        # undo the substitutions using the patterns stored by __call__
        return s.replace(self.accacc, '{{').replace(self.accper, '{%')


def update_one(file_name, out_file_name=None):
    sanitize = Sanitize()

    with open(file_name) as fp:
        data = yaml.load(sanitize(fp.read()))
    myArray = data['A']['B'][1]['myArray']
    pos = myArray.index('val2')
    myArray.insert(pos+1, 'val 3')
    if out_file_name is None:
        yaml.dump(data, sys.stdout, transform=sanitize.revert)
    else:
        with open(out_file_name, 'w') as fp:
            # write to the opened file (the original referred to an undefined name 'out')
            yaml.dump(data, fp, transform=sanitize.revert)

update_one('input.yaml')

Using Python 2.7, this prints the following (specify a second parameter to update_one() to write to a file instead):

# trying to update
{% set xyz = "123" }

A:
  B:
  - ip: 1.2.3.4
  - myArray:
    - {{ jinja.variable }}
    - val1
    - val2         # add a value after this one
    - val 3
    {% for d in data %}
    - phone: {{ d.phone }}
      name: {{ d.name }}
    {% endfor %}
    - {{ xyz }}
# #% or ##% should not be in the file and neither <{ or <<{

If neither #% nor <{ occurs in any of the original inputs, then sanitizing and reverting can be done with simple one-line functions (see an earlier revision of this post), and you don't need the class Sanitize.
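
As a rough sketch of what such one-line functions might look like (not necessarily identical to the ones in that earlier revision; the names sanitize and revert are chosen here for illustration), assuming the fixed substitutes '<{' and '#%':

```python
def sanitize(s):
    # '{{' -> '<{' makes variables plain scalars, '{%' -> '#%' makes block lines comments
    return s.replace('{{', '<{').replace('{%', '#%')


def revert(s):
    # inverse of sanitize(), valid only if '<{' and '#%' were absent from the original
    return s.replace('<{', '{{').replace('#%', '{%')


source = "{% for d in data %}\n- phone: {{ d.phone }}\n{% endfor %}\n"

# guard for the assumption stated above
assert '<{' not in source and '#%' not in source

sanitized = sanitize(source)
print(sanitized)
print(revert(sanitized) == source)
```

The guard is the whole difference with the class: the class searches for an unused pattern, whereas these functions simply assume one.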

Your example is indented by one position (key B) as well as by two positions (the sequence elements). ruamel.yaml doesn't have that fine a control over output indentation (and I don't know of any YAML parser that does). The indent (defaulting to 2) is applied to both YAML mappings and sequence elements (measured to the beginning of the element, not to the dash). This has no influence on re-reading the YAML, and it happened to the output of the other two answerers as well (without them pointing out the change).

Also note that YAML().load() is safe (i.e. it doesn't load arbitrary, potentially malicious objects), whereas yaml.load() as used in the other answers is definitely unsafe. The documentation says so, and it is even mentioned in the Wikipedia article on YAML. If you use yaml.load(), you would have to check each and every input file to make sure there are no tagged objects that could cause your disc to be wiped (or worse).
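
For readers who stay with PyYAML, here is a small sketch of what "safe" means in practice. It assumes PyYAML is installed and uses its safe_load() function; the tagged document is a deliberately harmless stand-in for a malicious one:

```python
import yaml  # PyYAML, assumed to be installed

# a tagged node asking the loader to call a Python function (harmless here)
doc = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(doc)
    rejected = False
except yaml.YAMLError:
    # the safe loader has no constructor for python/* tags and refuses the document
    rejected = True

print(rejected)
```

The safe loader only builds plain Python types (dicts, lists, strings, numbers and the like), so a tag like this is refused rather than executed.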

If you need to update your files repeatedly and have control over the jinja2 templating, it might be better to change the patterns for jinja2 once and not revert them. You would then specify appropriate block_start_string and variable_start_string values (and possibly block_end_string and variable_end_string) when creating the jinja2.Environment that uses a jinja2.FileSystemLoader as its loader.
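
A minimal sketch of that setup follows. It assumes jinja2 is installed and that the templates are stored permanently in sanitized form with '<{' and '#%'. A DictLoader with an inline template is used only to keep the example self-contained; for files on disk a FileSystemLoader would take its place:

```python
from jinja2 import Environment, DictLoader

# the template is kept in its sanitized form: '<{' instead of '{{', '#%' instead of '{%'
templates = {
    'input.yaml': "#% set xyz = '123' %}\nA:\n  B:\n  - <{ xyz }}\n",
}

env = Environment(
    loader=DictLoader(templates),  # stand-in for FileSystemLoader in this sketch
    block_start_string='#%',       # the end strings keep their defaults '%}' and '}}'
    variable_start_string='<{',
)

rendered = env.get_template('input.yaml').render()
print(rendered)
```

With this in place the stored file is always valid YAML, so it can be loaded and dumped directly without any sanitize and revert step.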

If the above seems too complicated, then in a virtualenv do:

pip install ruamel.yaml ruamel.yaml.jinja2

Assuming you have the input.yaml from before, you can run:

import os
from ruamel.yaml import YAML


yaml = YAML(typ='jinja2')

with open('input.yaml') as fp:
    data = yaml.load(fp)

myArray = data['A']['B'][1]['myArray']
pos = myArray.index('val2')
myArray.insert(pos+1, 'val 3')

with open('output.yaml', 'w') as fp:
    yaml.dump(data, fp)

os.system('diff -u input.yaml output.yaml')

to get the diff output:

--- input.yaml  2017-06-14 23:10:46.144710495 +0200
+++ output.yaml 2017-06-14 23:11:21.627742055 +0200
@@ -8,6 +8,7 @@
     - {{ jinja.variable }}
     - val1
     - val2         # add a value after this one
+    - val 3
     {% for d in data %}
     - phone: {{ d.phone }}
       name: {{ d.name }}

ruamel.yaml 0.15.7 implements a new plug-in mechanism, and ruamel.yaml.jinja2 is a plug-in that rewraps the code in this answer transparently for the user. Currently the information needed for reversion is attached to the YAML() instance, so make sure you create a fresh yaml = YAML(typ='jinja2') for each file you process (that information could instead be attached to the top-level data instance, just as the YAML comments are).
