How can I treat command-line arguments as UTF-8 in Perl?


Question

How do I treat the elements of @ARGV as UTF-8 in Perl?

Currently I'm using the following workaround:

    use Encode qw(decode encode);

    my $foo = $ARGV[0];
    $foo = decode("utf-8", $foo);

This works, but it is not very elegant.

I'm using Perl v5.8.8, called from bash v3.2.25, with LANG set to en_US.UTF-8.

Recommended answer

Outside data sources are tricky in Perl. Command-line arguments probably arrive in the encoding specified by your locale. Don't assume that your locale is the same as that of anyone else who might run your program.

You have to find out what that encoding is, then decode the arguments into Perl's internal format. Fortunately, it's not that hard.

The I18N::Langinfo module provides what you need to look up the encoding:

    use I18N::Langinfo qw(langinfo CODESET);
    my $codeset = langinfo(CODESET);

Once you know the encoding, you can decode the arguments to Perl strings:

    use Encode qw(decode);
    @ARGV = map { decode $codeset, $_ } @ARGV;
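
Putting the two snippets together, a minimal sketch of a complete script might look like the following. The helper name decode_args and the choice of UTF-8 for output are assumptions made for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;
use Encode qw(decode);
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: decode a list of raw byte strings from the given
# encoding into Perl character strings.
sub decode_args {
    my ($codeset, @raw) = @_;
    return map { decode($codeset, $_) } @raw;
}

# Ask the C library which encoding the current locale uses,
# for example "UTF-8" under LANG=en_US.UTF-8.
my $codeset = langinfo(CODESET);
my @args    = decode_args($codeset, @ARGV);

# Assumption: the output should be UTF-8. Encoding explicitly on output
# avoids "Wide character" warnings when printing decoded strings.
binmode STDOUT, ':encoding(UTF-8)';
printf "%s: %d character(s)\n", $_, length $_ for @args;
```

If Encode does not recognize the name that langinfo returns, decode dies with an "Unknown encoding" error, so a production script may want to trap that case.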

Although Perl can store strings internally as UTF-8, you should never rely on that or even think about it. Just decode whatever you receive, which turns it into Perl's internal representation. Trust Perl to handle everything else. When you need to store or output the data, encode it explicitly in the encoding you want.
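
As a rough illustration of that decode-on-input, encode-on-output pattern, the sketch below decodes a byte string once and then writes it out in two different encodings. The sample bytes and the in-memory buffer (standing in for a real file) are assumptions chosen to keep the example self-contained.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from outside: "café" encoded as UTF-8.
my $bytes = "caf\xC3\xA9";

# Decode once at the boundary; from here on, work with characters.
my $text = decode('UTF-8', $bytes);

# Encode explicitly when the data leaves the program. The target
# encoding is your choice and need not match the input encoding.
my $latin1 = encode('ISO-8859-1', $text);
my $utf8   = encode('UTF-8', $text);

# Alternatively, let an I/O layer encode for you. An in-memory buffer
# stands in for a file here; pass a file name instead to write to disk.
my $buffer = '';
open my $fh, '>:encoding(UTF-8)', \$buffer or die "Cannot open buffer: $!";
print {$fh} $text, "\n";
close $fh or die "Cannot close buffer: $!";
```

The program never inspects how Perl stores $text internally; it only chooses an encoding at the points where data enters or leaves.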

If you know that your setup is UTF-8 and that the terminal will give you the command-line arguments as UTF-8, you can use the A option with Perl's -C switch. This tells Perl to expect the elements of @ARGV to be encoded as UTF-8:

    % perl -CA program
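
To see the effect of the switch, a small test program along these lines can help. It is only a sketch: the helper name describe_length is hypothetical, and the program deliberately prints lengths rather than the arguments themselves, so no output layer is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: report how long Perl considers a string to be.
sub describe_length {
    my ($string) = @_;
    return sprintf '%d character(s)', length $string;
}

# Try:  perl -CA program 'café'
# With -CA the argument is decoded before the program starts, so length()
# counts characters. Without -CA, length() counts the raw bytes instead.
print describe_length($_), "\n" for @ARGV;
```

If the program also prints the arguments, combining the options as -CSA additionally applies UTF-8 to STDIN, STDOUT, and STDERR.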
