Read a text file with non-ASCII characters in an unknown encoding

Views: 599 Published: 2017/8/16 20:34:29 Tags: python encoding


Question

I want to read a file that contains German text, not only ASCII characters. I found that I can do it like this:

  >>> import codecs
  >>> file = codecs.open('file.txt','r', encoding='UTF-8')
  >>> lines= file.readlines()

This works when I run my job in Python IDLE, but when I run it from somewhere else it does not give the correct result. Any ideas?

Recommended answer

You need to know which character encoding the text is stored in. If you don't know that beforehand, you can try guessing it with the chardet module. First install it:

$ pip install chardet

Then, for example, read the file in binary mode and pass the raw bytes to chardet:

>>> import chardet
>>> chardet.detect(open("file.txt", "rb").read())
{'confidence': 0.9690625, 'encoding': 'utf-8'}

So then, open the file with the detected encoding:

>>> import codecs
>>> lines = codecs.open('file.txt', 'r', encoding='utf-8').readlines()
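
The two steps above (detect, then decode) can be combined into one helper. This is only a minimal sketch, not a definitive implementation. The names guess_encoding and read_lines, the 0.5 confidence threshold, and the fallback list ("utf-8", "cp1252") are assumptions made for illustration and are not part of chardet. If chardet is not installed, or its guess is weak, the sketch simply tries the fallback encodings in order.

```python
try:
    import chardet  # third-party package: pip install chardet
except ImportError:
    chardet = None


def guess_encoding(raw, fallbacks=("utf-8", "cp1252")):
    """Return a best-guess encoding name for the bytes in `raw`.

    The threshold and the fallback list are illustrative assumptions.
    """
    if chardet is not None:
        guess = chardet.detect(raw)
        # 'encoding' can be None when chardet cannot decide.
        if guess["encoding"] and guess["confidence"] >= 0.5:
            return guess["encoding"]
    # No usable guess: take the first fallback that decodes cleanly.
    for name in fallbacks:
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte value, so decoding with it never fails.
    return "latin-1"


def read_lines(path):
    """Read `path` as bytes, guess its encoding, and return (lines, encoding)."""
    with open(path, "rb") as handle:
        raw = handle.read()
    encoding = guess_encoding(raw)
    return raw.decode(encoding).splitlines(), encoding
```

Calling read_lines('file.txt') returns the decoded lines together with the encoding that was used, so the guess can be checked when the output looks wrong. Keep in mind that the result is still a guess: short files, or files with only a few non-ASCII bytes, can be misdetected.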
