Can a utf8-encoded Perl script open a file whose name is encoded as GB2312?


Problem description


I'm not talking about reading in the file content in utf-8 or non-utf-8 encoding and stuff. It's about file names. Usually I save my Perl script in the system default encoding, "GB2312" in my case and I won't have any file open problems. But for processing purposes, I'm now having some Perl script files saved in utf-8 encoding. The problem is: these scripts cannot open the files whose names consist of characters encoded in "GB2312" encoding and I don't like the idea of having to rename my files.

Does anyone have experience dealing with this kind of situation? Thanks, as always, for any guidance.

Edit

Here's the minimized code to demonstrate my problem:

# I'm running ActivePerl 5.10.1 on Windows XP (Simplified Chinese version)
# The file system is NTFS

#!perl -w
use autodie;

my $file = "./测试.txt"; #the file name consists of two Chinese characters
open my $in,'<',"$file";

while (<$in>){
print;
}

This test script runs well if saved in "ANSI" encoding (I assume ANSI encoding is the same as GB2312, which is used to display Chinese characters). But it doesn't work if saved as "UTF-8", and the error message is as follows:

Can't open './娴嬭瘯.txt' for reading: 'No such file or directory'.

In this warning message, "娴嬭瘯" are meaningless junk characters.
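The junk characters are most likely not random: they are what the UTF-8 bytes of the file name look like when they are read as GBK. A minimal sketch of that, assuming this example file is itself saved as UTF-8 and that the system ANSI code page is CP936 (GBK, a superset of GB2312); the helper names `misread_as_cp936` and `hex_bytes` are made up for illustration:

```perl
use strict;
use warnings;
use utf8;                      # assumption: this example file is saved as UTF-8
use Encode qw(encode decode);

# Hypothetical helper: take the UTF-8 bytes of $text and read them
# as if they were CP936 (GBK) bytes, which is what the console does.
sub misread_as_cp936 {
    my ($text) = @_;
    my $utf8_bytes = encode('UTF-8', $text);
    return decode('cp936', $utf8_bytes);
}

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $name = "测试";
my $junk = misread_as_cp936($name);

# Only ASCII is printed, so the output does not depend on the console encoding.
printf "UTF-8 bytes: %s\n", hex_bytes(encode('UTF-8', $name));
printf "CP936 bytes: %s\n", hex_bytes(encode('cp936', $name));
print  "misread name equals the junk in the error message\n"
    if $junk eq "娴嬭瘯";
```

The two hex lines differ, which is why the file system cannot find a file under the UTF-8 byte sequence.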

Update

I tried first encoding the file name as GB2312 but it does not seem to work :( Here's what I tried:

#!perl -w
use autodie;
use Encode;

my $file = "./测试.txt";
encode("gb2312", decode("utf-8", $file));
open my $in,'<',"$file";

while (<$in>){
print;
}

My current thinking is: the file name in my OS is 测试.txt but it is encoded as GB2312. In the Perl script the file name looks the same to human eyes, still 测试.txt. But to Perl, they are different because they have different internal representations. But I don't understand why the problem persists when I already converted my file name in Perl to GB2312 as shown in the above code.

Update

I made it, finally made it :)

@brian's suggestion is right. I made a mistake in the above code: encode() returns the encoded string rather than modifying its argument, and I didn't assign that result back to $file.

Here's the solution:

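#!perl -w
use autodie;
use Encode;

my $file = "./测试.txt";
# The script is saved as UTF-8 without "use utf8", so $file holds raw UTF-8 bytes.
# Decode them to characters, encode those as GB2312, and assign the result back.
$file = encode("gb2312", decode("utf-8", $file));
open my $in, '<', $file;

while (<$in>) {
    print;
}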

Solution

If you

 use utf8;

in your Perl script, that merely tells perl that the source is in UTF-8. It doesn't affect how perl deals with the outside world. Are you turning on any other Perl Unicode features?
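A small sketch of what that means in practice, assuming this example file is saved as UTF-8. The same literal is read once under `use utf8` and once under `no utf8`; the variable names are only illustrative:

```perl
use strict;
use warnings;

my ($chars, $bytes);
{
    use utf8;     # the literal is decoded into a string of characters
    $chars = "测试";
}
{
    no utf8;      # the literal stays as the raw bytes found in the source file
    $bytes = "测试";
}

# Only ASCII is printed, so no output layer is needed.
printf "with use utf8:    length %d\n", length $chars;
printf "without use utf8: length %d\n", length $bytes;
```

With a UTF-8 source file the first length should be 2 (characters) and the second 6 (bytes). Neither form is converted to the file system's encoding when it is passed to open.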

Are you having problems with every filename, or just some of them? Can you give us some examples, or a small demonstration script? I don't have a filesystem that encodes names as GB2312, but have you tried encoding your filenames as GB2312 before you call open?

If you want specific strings encoded with a specific encoding, you can use the Encode module. Try that with your filenames that you give to open.
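One possible way to package that suggestion is sketched below. It is not the asker's exact code: it assumes the script is saved as UTF-8 with `use utf8` enabled (so no decode step is needed), it assumes the system code page is CP936, and `to_native_name` is a hypothetical helper name.

```perl
use strict;
use warnings;
use utf8;                       # assumption: this script is saved as UTF-8
use Encode qw(encode);

# Hypothetical helper: turn a character string into the byte string
# used for file names on the local system.
# Encode::FB_CROAK makes encode() die on characters the target
# encoding cannot represent, instead of silently substituting "?".
sub to_native_name {
    my ($name, $encoding) = @_;
    $encoding //= 'cp936';      # assumption: Simplified Chinese ANSI code page
    return encode($encoding, $name, Encode::FB_CROAK);
}

my $file = to_native_name("./测试.txt");

if (open my $in, '<', $file) {
    print while <$in>;
    close $in;
}
else {
    warn "Cannot open file: $!\n";
}
```

Passing the encoding as a parameter keeps the code page assumption in one place, so it can be swapped for "gb2312" or another name that Encode recognises.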
