Read and write YAML files without destroying anchors and aliases?


Question

I need to open a YAML file that uses aliases:

defaults: &defaults
  foo: bar
  zip: button

node:
  <<: *defaults
  foo: other

This expands to the equivalent YAML document:

defaults:
  foo: bar
  zip: button

node:
  foo: other
  zip: button

which YAML::load reads without any problem.

I need to set new keys in this YAML document and then write it back to disk, preserving the original structure as much as possible.

I have looked at YAML::Store, but it completely destroys the aliases and anchors.
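To make the problem concrete, here is a minimal sketch of a plain load-and-dump round trip. It assumes Psych 3.1 or later (for the aliases: keyword of Psych.safe_load); the variable names are illustrative only.

```ruby
require 'psych'

source = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

# Loading resolves the alias and applies the merge key immediately.
data = Psych.safe_load(source, aliases: true)
data['node']['foo'] = 'yet another'

# After the merge, "node" is an independent hash, so the dumped document
# has nothing left to anchor or alias.
dumped = Psych.dump(data)
puts dumped
```

The merged values are written out in full under node, which is exactly the loss of structure described above.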

Is there anything available that could do something along the lines of:

thing = Thing.load("config.yml")
thing[:node][:foo] = "yet another"

which would then save the document as:

defaults: &defaults
  foo: bar
  zip: button

node:
  <<: *defaults
  foo: yet another

?

I opted for YAML because it handles this kind of aliasing well, but in practice the options for writing YAML that contains aliases look rather bleak.

Recommended answer

Using << to indicate that an aliased mapping should be merged into the current mapping isn't part of the core YAML spec, but it is part of the tag repository.

Psych, the current YAML library that ships with Ruby, provides the dump and load methods, which make it easy to serialize and deserialize Ruby objects and which apply the various implicit type conversions in the tag repository, including << to merge hashes. It also provides tools for lower-level YAML processing if you need them. Unfortunately it doesn't easily allow you to selectively disable or enable specific parts of the tag repository; it's an all-or-nothing affair. In particular, the handling of << is pretty much baked into the handling of hashes.
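As a rough illustration of the lower-level tools mentioned above, the sketch below parses the example document with Psych.parse and inspects the resulting node tree, where anchors and aliases are still present as such. The variable names are illustrative.

```ruby
require 'psych'

source = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

doc  = Psych.parse(source) # Psych::Nodes::Document
root = doc.root            # top-level Psych::Nodes::Mapping

# A mapping's children alternate between key nodes and value nodes.
root.children.each_slice(2) do |key, value|
  puts "#{key.value}: #{value.class} anchor=#{value.anchor.inspect}"
end

# Inside "node", the merge key is a plain scalar followed by an alias node.
node_mapping = root.children[3]
merge_key, merge_value = node_mapping.children.first(2)
puts "#{merge_key.value} -> #{merge_value.class} (*#{merge_value.anchor})"
```

At this level nothing has been merged yet; the merge only happens when the tree is converted to Ruby objects.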

One way to achieve what you want is to provide your own subclass of Psych's ToRuby class and override its revive_hash method, so that it just treats mapping keys of << as literals. This involves overriding a private method in Psych, so you need to be a little careful:

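require 'psych'

# Builds Ruby objects from a parsed YAML tree, but keeps "<<" as an
# ordinary key instead of merging the aliased mapping.
class ToRubyNoMerge < Psych::Visitors::ToRuby
  # Overrides a private Psych method, so this may break between versions.
  # The splat absorbs any extra arguments a given Psych version may pass.
  def revive_hash(hash, o, *)
    # Register the hash under its anchor so later aliases resolve to it.
    @st[o.anchor] = hash if o.anchor

    o.children.each_slice(2) { |k, v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end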

You would then use it like this:

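tree = Psych.parse your_data
# ToRuby.create builds the visitor with its default dependencies;
# older Psych versions used ToRubyNoMerge.new instead.
data = ToRubyNoMerge.create.accept tree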

With the YAML from your example, data would then look something like this:

{"defaults"=>{"foo"=>"bar", "zip"=>"button"},
 "node"=>{"<<"=>{"foo"=>"bar", "zip"=>"button"}, "foo"=>"other"}}

Note the << as a literal key. Also, the hash under the data["defaults"] key is the same object as the one under the data["node"]["<<"] key, i.e. they have the same object_id. You can now manipulate the data as you want, and when you write it out as YAML the anchors and aliases will still be in place, although the anchor names will have changed:
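The shared identity can be checked directly. The following is a minimal end-to-end sketch, assuming a Psych version that provides ToRuby.create; it repeats the visitor so the snippet runs on its own.

```ruby
require 'psych'

# Same idea as the visitor above: keep "<<" as a literal key.
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash(hash, o, *)
    @st[o.anchor] = hash if o.anchor
    o.children.each_slice(2) { |k, v| hash[accept(k)] = accept(v) }
    hash
  end
end

source = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

data = ToRubyNoMerge.create.accept(Psych.parse(source))

# The alias resolves to the very same Hash object, not to a copy.
shared = data['defaults'].equal?(data['node']['<<'])
puts shared

# A change made through one reference is visible through the other.
data['defaults']['zip'] = 'zipper'
puts data['node']['<<']['zip']
```

Because both keys point at one object, Psych has something to anchor and alias when the structure is dumped again.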

data['node']['foo'] = "yet another"
puts Psych.dump data

produces the following. (Psych used the object_id of the hash to ensure unique anchor names; current versions of Psych use sequential numbers instead.)

---
defaults: &2151922820
  foo: bar
  zip: button
node:
  <<: *2151922820
  foo: yet another
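The anchor and alias come purely from the same object appearing twice in the structure, which a hand-built hash can show. This is a sketch only: the generated anchor name, and how a literal << key is written, may differ between Psych versions, so the code reports both rather than assuming them.

```ruby
require 'psych'

defaults = { 'foo' => 'bar', 'zip' => 'button' }
data = {
  'defaults' => defaults,
  'node'     => { '<<' => defaults, 'foo' => 'yet another' }
}

dumped = Psych.dump(data)
puts dumped

# Pick out whatever anchor name this Psych version generated.
anchor = dumped[/&(\S+)/, 1]
puts "generated anchor name: #{anchor}"

# Report whether the merge key came back as a plain, unquoted "<<".
puts "merge key written unquoted: #{dumped.match?(/^\s*<<: \*/)}"
```

If the last line prints false on your Ruby version, the dumped file would no longer be read back as a merge, so it is worth running this check before relying on the approach.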

If you want control over the anchor names, you can provide your own Psych::Visitors::Emitter. Here's a simple example based on your data, assuming there is only the one anchor:

class MyEmitter < Psych::Visitors::Emitter
  def visit_Psych_Nodes_Mapping o
    o.anchor = 'defaults' if o.anchor
    super
  end

  def visit_Psych_Nodes_Alias o
    o.anchor = 'defaults' if o.anchor
    super
  end
end

When used with the modified data hash from above:

# create an AST based on the Ruby data structure
# (YAMLTree.create on current Psych; older versions used YAMLTree.new)
builder = Psych::Visitors::YAMLTree.create
builder << data
ast = builder.tree

# write out the tree using the custom emitter
MyEmitter.new($stdout).accept ast

The output is:

---
defaults: &defaults
  foo: bar
  zip: button
node:
  <<: *defaults
  foo: yet another
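For more than one anchor, one possible generalization (an assumption of this sketch, not the approach from the answer) is to rename each generated anchor after the key whose value carries it, then emit the tree with the stock emitter. It assumes every anchored mapping sits directly under a scalar key and that those keys are unique; the data is made up for illustration.

```ruby
require 'psych'

defaults = { 'foo' => 'bar' }
logging  = { 'level' => 'info' }
data = {
  'defaults' => defaults,
  'logging'  => logging,
  'node'     => { '<<' => defaults, 'log' => logging, 'foo' => 'other' }
}

builder = Psych::Visitors::YAMLTree.create
builder << data
ast = builder.tree

# Map each generated anchor to the key of the mapping that carries it.
names = {}
ast.grep(Psych::Nodes::Mapping).each do |mapping|
  mapping.children.each_slice(2) do |key, value|
    next unless value.is_a?(Psych::Nodes::Mapping) && value.anchor
    names[value.anchor] = key.value
  end
end

# Rename the anchors and every alias that points at them.
ast.each do |n|
  n.anchor = names[n.anchor] if n.respond_to?(:anchor) && names.key?(n.anchor)
end

output = ast.to_yaml
puts output
```

Mutating the node tree before emitting avoids hard-coding a single anchor name in the emitter, at the cost of one extra pass over the tree.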

(Update: another question asked how to do this with more than one anchor, where I came up with a possibly better way to keep anchor names when serializing.)
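A different route, offered here as a sketch rather than as the method from that update, is to skip the conversion to Ruby objects and edit the parsed node tree directly. The original anchor names then survive, although comments and blank lines are still lost. The mapping_value helper is hypothetical, written only for this example.

```ruby
require 'psych'

source = <<~YAML
  defaults: &defaults
    foo: bar
    zip: button

  node:
    <<: *defaults
    foo: other
YAML

# Hypothetical helper: find the value node stored under a scalar key.
def mapping_value(mapping, key)
  mapping.children.each_slice(2) do |k, v|
    return v if k.is_a?(Psych::Nodes::Scalar) && k.value == key
  end
  nil
end

stream = Psych.parse_stream(source)  # Psych::Nodes::Stream
root   = stream.children.first.root  # first document's top-level mapping

# Change the scalar in place; anchors and aliases are left untouched.
node = mapping_value(root, 'node')
mapping_value(node, 'foo').value = 'yet another'

output = stream.to_yaml
puts output
```

The trade-off is that you work with Psych node objects instead of plain hashes, so adding new keys means constructing Psych::Nodes::Scalar nodes by hand.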
