Python os.stat and Unicode file names

Problem description

In my Django application, a user has uploaded a file with a non-ASCII (Unicode) character in its name.

When I'm downloading files, I call:

os.path.exists(media)

to test that the file is there. This, in turn, seems to call

st = os.stat(path)

This then blows up with the error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xcf' in position 92: ordinal not in range(128)

What can I do about this? Is there an option to path.exists to handle it?
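
For reference, the same kind of failure can be reproduced without touching the filesystem. This is only a sketch, on the assumption that the path was being implicitly encoded as ASCII before reaching the OS. The `sample` path and the `try_encode` helper are made up for illustration.

```python
# Hypothetical path containing U+00CF, the character named in the traceback.
sample = u"/media/uploads/\xcf.txt"


def try_encode(text, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


# ASCII cannot represent U+00CF, so this yields a UnicodeEncodeError instance.
ascii_result = try_encode(sample, "ascii")

# UTF-8 can represent it, so this yields plain bytes.
utf8_result = try_encode(sample, "utf-8")
```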

Update: Actually, all I had to do was encode the argument to exists, i.e.

os.path.exists(media.encode('utf-8'))
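
A minimal sketch of that workaround as a reusable function, assuming file names are stored on disk as UTF-8. `path_exists` is a hypothetical helper name, not part of `os.path`.

```python
import os


def path_exists(path, encoding="utf-8"):
    """Check for a path, explicitly encoding text paths to bytes first.

    Assumption: names on disk are encoded with `encoding` (UTF-8 here).
    """
    # Byte strings pass through untouched; text (unicode in Python 2,
    # str in Python 3) is encoded explicitly instead of relying on the
    # interpreter's implicit filesystem encoding.
    if not isinstance(path, bytes):
        path = path.encode(encoding)
    return os.path.exists(path)
```

Called as `path_exists(media)`, this does the same thing as the one-liner above, but it also accepts a path that is already a byte string.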

Thanks everyone who answered.

Solution

I'm assuming you're in Unix. If not, please remember to say which OS you're in.

Make sure your locale is set to UTF-8. Most modern Linux distributions do this by default, usually by setting the environment variable LANG to "en_US.UTF-8" or the equivalent for another language. Also, make sure your filenames are actually encoded in UTF-8 on disk.

With that set, there's no need to mess with encodings to access files in any language, even in Python 2.x.

[~/test] echo $LANG
en_US.UTF-8
[~/test] echo testing > 漢字
[~/test] python2.6
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.stat("漢字")
posix.stat_result(st_mode=33188, st_ino=548583333L, st_dev=2049L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=8L, st_atime=1263634240, st_mtime=1263634230, st_ctime=1263634230)
>>> os.stat(u"漢字")
posix.stat_result(st_mode=33188, st_ino=548583333L, st_dev=2049L, st_nlink=1, st_uid=1000, st_gid=1000, st_size=8L, st_atime=1263634240, st_mtime=1263634230, st_ctime=1263634230)
>>> open("漢字").read()
'testing\n'
>>> open(u"漢字").read()
'testing\n'

If this doesn't work, run "locale"; if the values are "C" instead of en_US.UTF-8, you may not have the locale installed correctly.
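
The same check can be done from inside Python. The sketch below uses only the standard library. `describe_encoding_setup` is a made-up name, and which of these values matters in a given deployment is an assumption to verify, since a web server process may not inherit your shell's environment.

```python
import locale
import os
import sys


def describe_encoding_setup():
    """Collect the settings that usually decide how file names are encoded."""
    return {
        # Locale-related environment variables as seen by this process.
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        # Encoding Python uses to convert text paths to OS-level names.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding the locale reports, without changing the locale.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


for key, value in sorted(describe_encoding_setup().items()):
    print("%s = %s" % (key, value))
```

If `filesystem_encoding` comes back as ASCII (typically reported as `ANSI_X3.4-1968` under the "C" locale), Python 2 will try to encode unicode paths as ASCII, which would be consistent with the error in the question.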

If you're in Windows, I think Unicode filenames should always just work (at least for the os/posix modules), since the Unicode file API in Windows is supported transparently.
