Why do my Perl tests fail with use encoding 'utf8'?

Question

I'm puzzled with this test script:

#!perl

use strict;
use warnings;
use encoding 'utf8';
use Test::More 'no_plan';

ok('áá' =~ m/á/, 'ok direct match');

my $re = qr{á};
ok('áá' =~ m/$re/, 'ok qr-based match');

like('áá', $re, 'like qr-based match');

All three tests fail, but I was expecting that use encoding 'utf8' would upgrade both the literal áá and the qr-based regexps to utf8 strings, and that the tests would therefore pass.

If I remove the use encoding line the tests pass as expected, but I can't figure out why they would fail in utf8 mode.

I'm using perl 5.8.8 on Mac OS X (system version).

Recommended answer

Do not use the encoding pragma. It’s broken. (Juerd Waalboer gave a great talk where he mentioned this at YAPC::EU 2k8.)

It does at least two things at once that do not belong together:

  1. It specifies an encoding for your source file.
  2. It specifies an encoding for file input/output.

And to add injury to insult it also does #1 in a broken fashion: it reinterprets \xNN sequences as being undecoded octets as opposed to treating them like codepoints, and decodes them, preventing you from being able to express characters outside the encoding you specified and making your source code mean different things depending on the encoding. That’s just astonishingly wrong.

Write your source code in ASCII or UTF-8 only. In the latter case, the utf8 pragma is the correct thing to use. If you don’t want to use UTF-8, but you do want to include non-ASCII characters, escape or decode them explicitly.
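As a rough sketch of that advice, here is the test script from the question with use utf8 in place of use encoding 'utf8'. It assumes the file is saved as UTF-8. The last two tests are my own additions (not part of the original script) to illustrate the "escape or decode explicitly" alternatives; the variable names $escaped and $decoded are just illustrative.

```perl
#!perl

use strict;
use warnings;
use utf8;                       # tells perl this source file is UTF-8 encoded
use Encode qw(decode);
use Test::More tests => 5;

# The three tests from the question, unchanged apart from the pragma.
ok('áá' =~ m/á/, 'ok direct match');

my $re = qr{á};
ok('áá' =~ m/$re/, 'ok qr-based match');

like('áá', $re, 'like qr-based match');

# Alternative 1: keep the source pure ASCII and escape the code point.
# \x{E1} is U+00E1 (LATIN SMALL LETTER A WITH ACUTE).
my $escaped = "\x{E1}\x{E1}";
is($escaped, 'áá', 'escaped code points equal the literal');

# Alternative 2: start from raw UTF-8 octets and decode them explicitly.
my $decoded = decode('UTF-8', "\xC3\xA1\xC3\xA1");
is($decoded, 'áá', 'explicitly decoded octets equal the literal');
```

With use utf8 the literals and the regexp are all character strings, so the comparisons work on code points rather than on raw octets.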

And use I/O layers explicitly or set them using the open pragma to have I/O automatically transcoded properly.
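A minimal sketch of both options, again assuming the source file is saved as UTF-8. The file name encoding_demo.txt is a hypothetical scratch file used only for this illustration; it is created in the current directory and deleted at the end.

```perl
#!perl

use strict;
use warnings;
use utf8;                                   # source file is UTF-8

my $path = 'encoding_demo.txt';             # hypothetical scratch file
my $text = 'áá';

# Option 1: name the layer explicitly in the open mode.
open(my $out, '>:encoding(UTF-8)', $path) or die "Cannot write $path: $!";
print {$out} "$text\n";
close($out) or die "Cannot close $path: $!";

# Option 2: let the open pragma set the default layer for this lexical scope.
my $line;
{
    use open ':encoding(UTF-8)';
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    $line = <$in>;
    close($in);
}
chomp $line;

# Reading the same file without decoding shows the octets on disk.
open(my $raw, '<:raw', $path) or die "Cannot read $path: $!";
my $octets = <$raw>;
close($raw);
chomp $octets;

unlink $path;

print 'characters: ', length($line), "\n";  # 2 (decoded by the layer)
print 'octets: ', length($octets), "\n";    # 4 (two bytes per character)
```

The layer does the transcoding at the I/O boundary, so the rest of the program only ever deals with character strings.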
