How to force PyYAML to load strings as unicode objects?


Problem description

The PyYAML package loads unmarked strings as either unicode or str objects, depending on their content.

I would like to use unicode objects throughout my program (and, unfortunately, can't switch to Python 3 just yet).

Is there an easy way to force PyYAML to always load strings as unicode objects? I do not want to clutter my YAML with !!python/unicode tags.

# Encoding: UTF-8

import yaml

menu = u"""---
- spam
- eggs
- bacon
- crème brûlée
- spam
"""

print yaml.load(menu)

Output: ['spam', 'eggs', 'bacon', u'cr\xe8me br\xfbl\xe9e', 'spam']

I would like: [u'spam', u'eggs', u'bacon', u'cr\xe8me br\xfbl\xe9e', u'spam']

Solution

Here's a version that overrides PyYAML's string handling so that it always outputs unicode. In practice it probably gives the same result as the other answer I posted, only shorter. As with that answer, if you use custom handlers you still need to make sure that strings in custom classes are converted to unicode, or pass unicode strings yourself:

# -*- coding: utf-8 -*-
import yaml
from yaml import Loader, SafeLoader

def construct_yaml_str(self, node):
    # Override the default string handling function 
    # to always return unicode objects
    return self.construct_scalar(node)
Loader.add_constructor(u'tag:yaml.org,2002:str', construct_yaml_str)
SafeLoader.add_constructor(u'tag:yaml.org,2002:str', construct_yaml_str)

print yaml.load(u"""---
- spam
- eggs
- bacon
- crème brûlée
- spam
""")

(The above gives [u'spam', u'eggs', u'bacon', u'cr\xe8me br\xfbl\xe9e', u'spam'])
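
For context, the mixed output in the question comes from PyYAML's default string constructor on Python 2, which tries to encode each scalar to ASCII and only keeps it as unicode when that fails. The sketch below imitates that decision in plain Python, without importing PyYAML. The function names `default_style_str` and `always_unicode_str` are hypothetical, and on Python 3 the ASCII branch yields `bytes` where Python 2 would yield `str`.

```python
# -*- coding: utf-8 -*-

def default_style_str(value):
    # Imitates the Python 2 default: ASCII-only text is encoded to a
    # byte string, anything else is returned unchanged as unicode.
    try:
        return value.encode('ascii')
    except UnicodeEncodeError:
        return value


def always_unicode_str(value):
    # Imitates the override: the scalar is returned as-is,
    # so every item keeps the same text type.
    return value


items = [u'spam', u'cr\u00e8me br\u00fbl\u00e9e']
mixed = [default_style_str(item) for item in items]
uniform = [always_unicode_str(item) for item in items]
```

Here `mixed` holds one byte string and one unicode string, while `uniform` holds only unicode strings. That is the difference between the question's output and the desired output.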

I haven't tested this override on LibYAML (the C-based parser) because I couldn't compile it, so I'll leave the other answer as it was.
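
If changing the global `Loader` and `SafeLoader` classes is a concern, one alternative is to register the same constructor on a subclass and pass that subclass explicitly. The sketch below assumes PyYAML's pure-Python loader. `UnicodeSafeLoader` and `construct_unicode_str` are hypothetical names, not part of PyYAML, and the sketch has likewise not been checked against LibYAML.

```python
# -*- coding: utf-8 -*-
import yaml


class UnicodeSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass whose plain strings stay unicode."""


def construct_unicode_str(loader, node):
    # Same idea as the override above: return the scalar as-is
    # instead of attempting an ASCII encode first.
    return loader.construct_scalar(node)


# Registering on the subclass leaves yaml.SafeLoader itself untouched.
UnicodeSafeLoader.add_constructor(u'tag:yaml.org,2002:str',
                                  construct_unicode_str)

menu = yaml.load(u"- spam\n- cr\u00e8me br\u00fbl\u00e9e\n",
                 Loader=UnicodeSafeLoader)
```

`add_constructor` copies the constructor table the first time it is called on a subclass, so `yaml.safe_load` and the stock loaders keep their default behaviour.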
