使utf8在文件中可读 [英] Make utf8 readable in a file

查看:118
本文介绍了使utf8在文件中可读的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有字典的字典有utf8编码的密钥。我使用 json 模块将此字典转储到一个文件。

在文件键中以utf8格式打印。键实际上是孟加拉语的字母。

I have dictionary of dictionary which has utf8 encoded keys. I am dumping this dictionary to a file using json module.
In the file keys are printed in utf8 format. Keys are actually letters of Bengali language.

我想在文件中写入实际的字母。 如何做到这一点

I want actual letters to get written in the file. How to do this ??

如果我打印这些键(其中一个是u'\\\ং')到控制台实际字母ং)显示,但在我的文件 \\\ং .is写。

If I print those keys(one of them is u'\u0982') to console actual letter(ং) is shown but in my file \u0982.is written. What does print do to show the actual letter?

推荐答案

您正在编写JSON; JSON标准允许 \uxxxx 转义序列来编码非ASCII字符。默认情况下,Python json 模块使用此方法。

You are writing JSON; the JSON standard allows for \uxxxx escape sequences to encode non-ASCII characters. The Python json module uses this by default.

使用

json.dump(obj, yourfileobject, ensure_ascii=False)

这意味着输出不再被编码为UTF-8字节;您需要为此使用 codecs.open()管理文件:

This does mean that the output is no longer encoded to UTF-8 bytes as well; you'll need to use a codecs.open() managed file for this:

import json
import codecs

with codecs.open('/path/to/file', 'w', encoding='utf8') as output:
    json.dump(obj, output, ensure_ascii=False)

现在你的unicode字符将被写入该文件以UTF-8编码的字节代替。当使用另一个再次解码UTF-8的程序打开文件时,您的代码点应该再次显示为相同的字符。

Now your unicode characters will be written to the file as UTF-8 encoded bytes instead. When opening the file with another program that decodes UTF-8 again, your codepoints should be displayed again as the same characters.

这篇关于使utf8在文件中可读的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆