Python 3 Unicode encode error


Question

I'm using glob.glob to get a list of files from a directory given as input. When I try to open those files, Python fails with this error:

UnicodeEncodeError: 'charmap' codec can't encode character '\xf8' in position 18: character maps to <undefined>

If I define the path as a string variable first, I can do this:

filePath = r"C:\Users\Jørgen\Tables\\"

Is there some way to get the effect of the 'r' prefix for a string held in a variable?

import glob

di = r"C:\Users\Jørgen\Tables\\"

def main():
    fileList = getAllFileURLsInDirectory(di)
    print(fileList)

def getAllFileURLsInDirectory(directory):
    return glob.glob(directory + '*.xls*')

There is a lot more code, but this problem stops the process.

Recommended answer

Whether you use a raw string literal or a normal string literal, the Python interpreter must know the source code encoding. It seems you are using some 8-bit encoding rather than UTF-8. In that case you have to add a line like

# -*- coding: cp1252 -*-

at the beginning of the file (or name whichever encoding your source files actually use). It does not have to be the first line, but it must be on the first or second line (for a script used on Windows, the first line should contain #!python3).
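To illustrate why the interpreter needs to be told the encoding, here is a minimal sketch. The byte string and the helper name try_decode are illustrative assumptions, not part of the question. It decodes the same bytes under two encodings; reading a source file is the same operation.

```python
# The same bytes mean different things under different encodings.
# 0xF8 is how cp1252 stores the letter "ø" (U+00F8).
raw = b"J\xf8rgen"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_cp1252 = try_decode(raw, "cp1252")  # decodes cleanly
as_utf8 = try_decode(raw, "utf-8")     # a lone 0xF8 byte is not valid UTF-8

# ascii() keeps the output printable on any console.
print(ascii(as_cp1252), as_utf8)
```

A file saved as cp1252 but read as UTF-8 (the Python 3 default for source files) fails in the same way unless the coding line is present.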

Anyway, it is usually better not to use non-ASCII characters in file or directory names.

You can also use forward slashes in the path (the same way as on Unix-based systems). Also, have a look at os.path.join when you need to compose paths.
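As a rough sketch of the os.path.join suggestion, the question's lookup could be written as follows. The function name list_excel_files and the temporary demo directory are assumptions made for illustration, not the asker's actual code.

```python
import glob
import os
import tempfile


def list_excel_files(directory):
    """Return sorted paths in `directory` matching *.xls* (e.g. .xls, .xlsx)."""
    # os.path.join inserts the separator, so `directory` needs no trailing slash.
    pattern = os.path.join(directory, "*.xls*")
    return sorted(glob.glob(pattern))


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.xls", "b.xlsx", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_excel_files(tmp)]

print(found)  # ['a.xls', 'b.xlsx']
```

On Windows the same function can be called with a forward-slash path such as "C:/Users/Jørgen/Tables", which avoids both the raw-string prefix and the doubled trailing backslash.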

Update

The problem is probably not where you are looking for it. My guess is that the error shows up only when you display the resulting list via print. This usually happens because the console uses, by default, a non-Unicode encoding that cannot represent the character. Try the chcp command without arguments in your cmd window to see the active code page.
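The same check can be made from inside Python. The sketch below is only an illustration: the helper can_encode is a made-up name, and cp437 is an assumed example of a console code page (the question does not say which one is active).

```python
import sys


def can_encode(text, encoding):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


name = "J\u00f8rgen"  # "Jørgen", written with an escape to keep this file ASCII-only

# The encoding print() uses for the current console or pipe.
print("stdout encoding:", sys.stdout.encoding)

# cp437 (assumed here as an example console code page) has no "ø";
# cp850 and UTF-8 both do.
for enc in ("cp437", "cp850", "utf-8"):
    print(enc, can_encode(name, enc))
```

If the encoding reported for stdout cannot encode the character, print raises the UnicodeEncodeError from the question even though glob.glob itself worked.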

You can modify the print call in your main() function to convert the string representation to an ASCII one that can always be displayed:

print(ascii(fileList))
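For reference, a small sketch of what ascii() produces. The sample path is invented for illustration and is not taken from the asker's machine.

```python
# A sample result list; the path is hypothetical.
paths = ["C:\\Users\\J\u00f8rgen\\Tables\\a.xls"]

# ascii() works like repr() but escapes every non-ASCII character,
# so the result can be printed on any console.
safe = ascii(paths)
print(safe)
```

The "ø" appears as the escape \xf8 in the output. The strings in the list are unchanged, so they can still be passed to open().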
