How to write foreign encoded characters to a text file

Question

I am recursing through folders and gathering the document names and some other data to load into a database.

import os
text_file = open("Output.txt", "w")

dirName = 'D:\\'
for nextDir, subDir, fileList in os.walk(dirName):
    for fname in fileList: 
        text_file.write(fname + '\n')

The problem is that some document names contain foreign (non-ASCII) characters, such as:

RC-0964_1000 Tưởng thưởng Diamond trẻ nhất Việt Nam - Đặng Việt Thắng và Trần Thu Phương

and

RC-1046 安麗2013ARTISTRY冰上雅姿盛典-愛里歐娜.薩維琴科_羅賓.索爾科維【Suit & Tie】.mp4

The code above raises this error on the last line:

UnicodeEncodeError: 'charmap' codec can't encode characters in position ##-##: character maps to <undefined>

I have tried:

  • temp = fname.endcode(utf-8)
  • temp = fname.decode(utf-8)
  • temp = fname.encode('ascii','ignore') temp2 = temp.decode('ascii')
  • temp =unicode(fname).encode('utf8')

How can I change this script so that it writes all characters to the file? Do I need to change the file I'm writing to or the string I'm writing, and how?

These names can be pasted into the file successfully, so why won't Python write them?

Recommended answer

Since this is Python 3, open the output file with an encoding that supports all of Unicode, such as UTF-8. On Windows, at least, the default encoding used by open() depends on the locale (for example cp1252), and writing fails for any character that encoding cannot represent, such as Chinese.
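To see the failure in isolation, the following sketch checks whether a name can be encoded with a given codec. The helper can_encode and the sample names are illustrative assumptions, not part of the original answer; cp1252 stands in for a typical Western European Windows default.

```python
import locale

# Encoding that open() falls back to when none is given (locale dependent).
print(locale.getpreferredencoding(False))

def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Sample names shortened from the ones in the question.
chinese_name = "RC-1046 安麗2013ARTISTRY.mp4"
vietnamese_name = "Việt Nam - Đặng Việt Thắng"

print(can_encode(chinese_name, "cp1252"))     # False
print(can_encode(vietnamese_name, "cp1252"))  # False
print(can_encode(chinese_name, "utf-8"))      # True
```

Passing an explicit encoding to open() avoids depending on that locale default: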

text_file = open("Output.txt", "w", encoding='utf8')
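Applied to the loop from the question, a minimal sketch might look like the following. The helper names iter_file_names and write_names are hypothetical, and the with block is an addition (not in the original answer) so that the file is flushed and closed when the loop finishes.

```python
import os

def iter_file_names(root_dir):
    """Yield the name of every file found under root_dir (hypothetical helper)."""
    for _next_dir, _sub_dirs, file_list in os.walk(root_dir):
        yield from file_list

def write_names(names, output_path):
    """Write one name per line as UTF-8 and return how many were written."""
    count = 0
    # An explicit encoding replaces the locale-dependent default of open().
    with open(output_path, "w", encoding="utf-8") as text_file:
        for name in names:
            text_file.write(name + "\n")
            count += 1
    return count
```

Calling write_names(iter_file_names('D:\\'), "Output.txt") would reproduce the original script. Whatever reads Output.txt afterwards, including a database loader, also needs to treat the file as UTF-8.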
