Python: How to move a file with a Unicode filename to a Unicode folder
Question
I'm having hell moving a Unicode-named file between Unicode-named folders in a Python script under Windows...
What syntax would you use to find all files of type *.ext in a folder and move them to a relative location?
Assume the file and folder names are Unicode.
Recommended answer
The basic problem is an unconverted mix of Unicode strings and byte strings. The solutions either convert everything to a single format or sidestep the problem with some trickery. All of my solutions use the glob and shutil standard library modules.
For the sake of example, I have some Unicode filenames ending with ods, and I want to move them to the subdirectory called א (Hebrew Aleph, a Unicode character).
First solution - express every name as a byte string:
>>> import glob
>>> import shutil
>>> files = glob.glob('*.ods')  # List of byte string file names
>>> for file in files:
...     shutil.copy2(file, 'א')  # Byte string directory name
...
Second solution - convert the file names to Unicode:
>>> import glob
>>> import shutil
>>> files = glob.glob(u'*.ods')  # List of Unicode file names
>>> for file in files:
...     shutil.copy2(file, u'א')  # Unicode directory name
...
Credit to Ezio Melotti, Python bug list.
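The snippets above copy the files, while the question asks about moving them. Here is a minimal sketch of the same all-Unicode idea using shutil.move. It assumes Python 3, where every str is already Unicode, so no u prefix is needed. The helper name move_by_extension and its parameters are hypothetical, not part of the original answer.

```python
import glob
import os
import shutil


def move_by_extension(src_dir, pattern, dest_dir):
    """Move every file in src_dir matching pattern into dest_dir.

    Hypothetical helper: all arguments are text (Unicode) strings, so
    glob returns text file names and no bytes/text mixing can occur.
    """
    os.makedirs(dest_dir, exist_ok=True)  # create the destination if missing
    moved = []
    for path in glob.glob(os.path.join(src_dir, pattern)):
        if os.path.isfile(path):  # skip directories that match the pattern
            # shutil.move returns the final destination path
            moved.append(shutil.move(path, dest_dir))
    return moved
```

For the example in this answer, the call would look like move_by_extension('.', '*.ods', 'א'). Note that shutil.move raises an error if a file with the same name already exists in the destination directory.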
Although this isn't the best solution in my opinion, there is a nice trick here that's worth mentioning.
Change your directory to the destination directory using os.chdir(), and then copy the files to it by referring to it as . (the current directory):
# -*- coding: utf-8 -*-
import os
import shutil
import glob

os.chdir('א')  # CD to the destination Unicode directory
print os.getcwd()  # DEBUG: Make sure you're in the right place
files = glob.glob('../*.ods')  # List of byte string file names
for file in files:
    shutil.copy2(file, '.')  # Copy each file
# Don't forget to go back to the original directory here, if it matters
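Going back to the original directory can be made automatic with a small context manager. This is only a sketch, assuming Python 3. The names working_directory and copy_from_parent are hypothetical helpers, and the destination is assumed to be a direct subdirectory of the folder that holds the files, as in the script above.

```python
import contextlib
import glob
import os
import shutil


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it on exit."""
    original = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(original)  # runs even if a copy raises


def copy_from_parent(dest_dir, pattern):
    """Copy files matching pattern in dest_dir's parent into dest_dir."""
    with working_directory(dest_dir):
        # Inside dest_dir, the source folder is '..' and the target is '.'
        for name in glob.glob(os.path.join(os.pardir, pattern)):
            shutil.copy2(name, os.curdir)
```

With this in place, copy_from_parent('א', '*.ods') performs the same copy as the script and leaves the current directory unchanged afterwards.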
Deeper explanation
The straightforward approach shutil.copy2(src, dest) fails because shutil concatenates a unicode string with a byte string without converting either one, so Python 2 implicitly tries to decode the byte string as ASCII:
>>> files = glob.glob('*.ods')
>>> for file in files:
...     shutil.copy2(file, u'א')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/lib/python2.6/shutil.py", line 98, in copy2
    dst = os.path.join(dst, os.path.basename(src))
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 1: ordinal not in range(128)
As seen before, this can be avoided by using the byte string 'א' instead of the Unicode u'א'.
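The implicit ASCII decode can also be sidestepped by decoding byte string names explicitly before they reach shutil. The following is a minimal sketch, assuming the names are encoded in the file system encoding unless another encoding is given. The helper to_text is hypothetical and not part of the original answer; Python 3 ships os.fsdecode for a similar purpose.

```python
import sys


def to_text(name, encoding=None):
    """Return name as a Unicode string, decoding byte strings explicitly."""
    if isinstance(name, bytes):
        # Assumption: names use the file system encoding unless told otherwise
        return name.decode(encoding or sys.getfilesystemencoding())
    return name


# The two bytes below are the UTF-8 encoding of Hebrew Aleph (U+05D0);
# 0xd7 is the byte the traceback above complains about.
dest = to_text(b'\xd7\x90', 'utf-8')
```

Passing both the file name and the directory name through such a helper before calling shutil.copy2 means os.path.join only ever sees one string type.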
In my opinion, this is a bug, because Python cannot expect basedir names to always be str rather than unicode. I have reported this as an issue in the Python bug list and am waiting for responses.