Printing Filenames with non-ASCII Characters


Problem description






Hi,

I am very new to Python and have run into the following problem. If I do
something like

    dir = os.listdir(somepath)
    for d in dir:
        print d

The program fails for filenames that contain non-ascii characters.

'ascii' codec can't encode characters in position 33-34:

I have noticed that this seems to be a very common problem. I have read a lot
of postings regarding it but not really found a solution. Is there a simple
one?

What I specifically do not understand is why Python wants to interpret the
string as ASCII at all. Where is this setting hidden?

I am running Python 2.3.4 on Windows XP and I want to run the program on
Debian sarge later.

Ciao, MM
--
Marian Aldenhövel, Rosenhain 23, 53123 Bonn. +49 228 624013.
http://www.marian-aldenhoevel.de
"There is a procedure to follow in these cases, and if followed it can
pretty well guarantee a generous measure of success, success here
defined as survival with major extremities remaining attached."

Solutions

On Tue, 01 Feb 2005 20:28:11 +0100, Marian Aldenhövel
<ma****@mba-software.de> wrote:

> Hi,
>
> I am very new to Python and have run into the following problem. If I do
> something like
>
>     dir = os.listdir(somepath)
>     for d in dir:
>         print d
>
> The program fails for filenames that contain non-ascii characters.
>
> 'ascii' codec can't encode characters in position 33-34:
>
> I have noticed that this seems to be a very common problem. I have read
> a lot of postings regarding it but not really found a solution. Is there
> a simple one?

The English Windows command prompt uses the cp437 character set. To
print the name, use

    print d.encode('cp437')

The issue is that a terminal only understands a certain character set. If
you have a unicode string, like d in your case, you have to encode it
before it can be printed. (We really need native unicode terminals!) If
you don't encode, Python will do it for you. The default encoding is
ASCII, so any string that contains non-ASCII characters will give you
trouble. In my opinion Python is too conservative in using 'strict' error
handling, which gives users unaware of unicode a lot of woes.
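
To make the point about 'strict' concrete, here is a minimal sketch that is not taken from the thread. The helper name `encode_for_terminal` and the sample filename are made up for illustration; cp437 is the code page mentioned above. It relies only on the standard `encode(encoding, errors)` string method, where the 'replace' error handler substitutes `?` for characters the target code page lacks instead of raising.

```python
# Illustrative sketch; encode_for_terminal and the sample name are hypothetical.
def encode_for_terminal(name, encoding="cp437"):
    """Encode a unicode filename for a terminal, replacing unmappable characters."""
    return name.encode(encoding, "replace")


# e-acute exists in cp437, the CJK character does not.
sample = u"caf\xe9_\u4e2d.txt"

# With the default 'strict' handler, the ascii codec raises UnicodeEncodeError.
try:
    sample.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With 'replace', encoding succeeds and the unmappable character becomes '?'.
encoded = encode_for_terminal(sample)
```

Whether replacing characters is acceptable depends on the program: it is fine for display, but the replaced name can no longer be used to open the file.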

So how did you get a unicode d to start with? If 'somepath' is unicode,
os.listdir returns a list of unicode strings. So why is somepath unicode?
Either you have entered a unicode literal or it comes from some other
source. One possible source is an XML parser, which returns strings in
unicode.

Windows NT supports unicode filenames. I'm not sure about Linux. The
result may differ slightly.
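
The rule that the result type follows the argument type can be checked directly. The sketch below is an illustration under one loud assumption: it targets Python 3, where the pair is str/bytes, whereas in the Python 2.3 discussed in this thread the corresponding pair is unicode/str. The function name `listing_types` and the file `example.txt` are hypothetical.

```python
import os
import tempfile


def listing_types(path):
    """Return the element types os.listdir yields for text and bytes paths."""
    text_names = os.listdir(path)                # text path in -> text names out
    bytes_names = os.listdir(os.fsencode(path))  # bytes path in -> bytes names out
    return type(text_names[0]), type(bytes_names[0])


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file, created only so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_type, bytes_type = listing_types(tmp)
```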






Marian Aldenhövel wrote:

> Hi,
>
> I am very new to Python and have run into the following problem. If I do
> something like
>
>     dir = os.listdir(somepath)
>     for d in dir:
>         print d
>
> The program fails for filenames that contain non-ascii characters.
>
> 'ascii' codec can't encode characters in position 33-34:
>
> I have noticed that this seems to be a very common problem. I have read
> a lot of postings regarding it but not really found a solution. Is there
> a simple one?

No :) You're trying to deal with legacy terminals, and you can't reliably
print unicode characters across various terminals. It's not really
Python's fault.

> What I specifically do not understand is why Python wants to interpret
> the string as ASCII at all. Where is this setting hidden?

http://www.python.org/moin/PrintFails Let me know if it's not clear. It
would be great if other people fixed/improved this page.

> I am running Python 2.3.4 on Windows XP and I want to run the program on
> Debian sarge later.

You need a cross-platform terminal that supports unicode output.

Sergey.


Marian Aldenhövel wrote:

> Hi,
>
> I am very new to Python and have run into the following problem. If I do
> something like
>
>     dir = os.listdir(somepath)
>     for d in dir:
>         print d
>
> The program fails for filenames that contain non-ascii characters.
>
> 'ascii' codec can't encode characters in position 33-34:

If you read this carefully, you'll notice that Python has tried and
failed to *encode* a decoded (= unicode) string using the 'ascii'
codec. IOW, d seems to be bound to a unicode string, which is unexpected
unless the argument passed to os.listdir (somepath) is a Unicode
string, too. (If given a Unicode string as argument, os.listdir will
return the list as a list of unicode names.)

If you're printing to the console, modern Pythons will try to guess the
console's encoding (e.g. cp850). I would expect a UnicodeEncodeError if
the print fails because the characters do not map to the console's
encoding, not the error you're seeing.
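
The encoding Python guessed for the console is exposed as `sys.stdout.encoding`. As a rough sketch, written for Python 3 and not taken from the thread, the hypothetical helpers `console_encoding` and `printable` below look that value up and replace anything the console cannot represent. The 'ascii' fallback is an assumption chosen to mirror the default discussed here.

```python
import io
import sys


def console_encoding(stream=None, fallback="ascii"):
    """Return the encoding the stream reports, or the fallback if it reports none."""
    if stream is None:
        stream = sys.stdout
    return getattr(stream, "encoding", None) or fallback


def printable(name, stream=None):
    """Replace characters the stream's encoding cannot represent with '?'."""
    encoding = console_encoding(stream)
    return name.encode(encoding, "replace").decode(encoding)


# io.StringIO reports no encoding, so the ascii fallback applies here.
fake_console = io.StringIO()
safe_name = printable(u"caf\xe9.txt", fake_console)
```

Calling `print(printable(name))` then degrades to question marks on a limited console instead of stopping the loop with an exception.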

How *are* you running the program? In the console (cmd.exe)? Or from
some IDE?

> I have noticed that this seems to be a very common problem. I have read
> a lot of postings regarding it but not really found a solution. Is there
> a simple one?
>
> What I specifically do not understand is why Python wants to interpret
> the string as ASCII at all. Where is this setting hidden?

Don't be tempted to ever change the default encoding in site.py
(sys.setdefaultencoding). This is site specific, meaning that if you ever
distribute them, programs relying on this setting may fail on other
people's Python installations.

--
Vincent Wehren
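
A per-program alternative to a site-wide default is to wrap the output stream in an explicit encoder. The following is a minimal sketch, not something proposed in the thread. It assumes UTF-8 is the encoding the destination expects and uses an in-memory `io.BytesIO` as a stand-in for a real byte stream. `codecs.getwriter` is the standard-library call that returns a StreamWriter class for a named codec.

```python
import codecs
import io

# A BytesIO stands in for a byte-oriented output stream in this sketch.
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named codec;
# instantiating it wraps the byte stream. UTF-8 is an assumed target.
writer = codecs.getwriter("utf-8")(raw, errors="replace")
writer.write(u"caf\xe9.txt\n")

# The wrapped stream now holds the encoded bytes.
written = raw.getvalue()
```

Because the encoding is chosen inside the program, the behaviour does not depend on how any particular Python installation is configured.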



