PCRE Encoding Support


Question

I saw in the PCRE documentation that PCRE supports UTF-8 and Unicode general category properties, but I don't see where it says which native encoding it supports.

If you say that it supports ISO-8859-1, where can I find information about that?

In short: I've compared the two, and I'm guessing that the encoding supported by PHP is Windows-1252 and not ISO-8859-1.

// The result depends on the encoding in which this source file is saved.
if (preg_match('/€/', "\x80")) {
    echo "Match";
}

ISO-8859-1 doesn't have the '€' in that position; Windows-1252 does. Or does it depend on the system?

So which native encoding does PCRE support?

Recommended Answer

Exactly this example is used on regular-expressions.info to describe the difficulties of mixing 8-bit and Unicode character codes:

Mixing Unicode and 8-bit Character Codes

In short, the Euro symbol is at 80h in all Windows code pages. How your regex engine treats this may vary. It works when your regex engine is an 8-bit one and the text file uses a Windows code page.
If your regex engine is a pure Unicode one, it will read \x80 as \u0080, which is a control code.
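The difference between the two modes can be sketched in PHP, where the `/u` modifier switches PCRE from byte matching to UTF-8 matching. This is a minimal illustration rather than a definitive test; the variable names are made up for the example, and the bytes are written as escapes so the outcome does not depend on how the file is saved.

```php
<?php
// Bytes are written as escapes so the result does not depend on
// the encoding in which this file is saved.
$euroCp1252 = "\x80";      // Euro sign as a Windows-1252 byte
$c1Utf8     = "\xC2\x80";  // U+0080 (a control code) encoded as UTF-8

// 8-bit mode (no /u): \x80 in the pattern is the single byte 0x80.
$byteMode = preg_match('/^\x80$/', $euroCp1252);

// UTF-8 mode (/u): \x80 in the pattern is the code point U+0080.
$unicodeMode = preg_match('/^\x80$/u', $c1Utf8);

// In UTF-8 mode a lone 0x80 byte is not valid input, so matching fails.
$invalidInput = preg_match('/\x80/u', $euroCp1252);

var_dump($byteMode, $unicodeMode, $invalidInput);
```

In byte mode PCRE itself attaches no meaning to 0x80; whether that byte is a Euro sign is decided only by the code page the text was written in.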

So what do you mean by native encoding support in PCRE? This is system dependent, and you should not rely on particular code pages.

The advantage of Unicode is that you can get rid of all the different code pages and all of the problems derived from them.

So, to use Unicode for that, try matching \x{20AC}, which is the Unicode code point for the Euro symbol.
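As a rough sketch of that advice in PHP: `\x{20AC}` needs the `/u` modifier, and the subject must be valid UTF-8. The sample strings and variable names below are hypothetical, and the conversion step assumes the mbstring extension is available.

```php
<?php
// The subject is given as explicit UTF-8 bytes (E2 82 AC = U+20AC).
$utf8Text = "Price: 10 \xE2\x82\xAC";

// With /u, \x{20AC} matches the Euro sign by code point.
$found = preg_match('/\x{20AC}/u', $utf8Text);

// Hypothetical legacy input in Windows-1252, converted to UTF-8 first
// (assumes the mbstring extension is loaded).
$legacy      = "Price: 10 \x80";
$converted   = mb_convert_encoding($legacy, 'UTF-8', 'Windows-1252');
$foundLegacy = preg_match('/\x{20AC}/u', $converted);

var_dump($found, $foundLegacy);
```

Converting input to UTF-8 at the boundary keeps the pattern independent of whichever code page the data originally used.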

See regular-expressions.info for information on the Unicode syntax.
