unicode preg_replace problem in php

Problem description

I have the string

$result = "bei einer Temperatur, die etwa 20 bis 60°C unterhalb des Schmelzpunktes der kristallinen Modifikation"

which comes straight from a MySQL table. The table and the PHP headers are both set to UTF-8.

I want to strip the 'degree' symbol: http://en.wikipedia.org/wiki/Degree_symbol and replace it with the word 'degrees' to get:

"bei einer Temperatur, die etwa 20 bis 60degreesC unterhalb des Schmelzpunktes der kristallinen Modifikation"

but I can't get it to work with preg_replace.

If I do this:

$result = preg_replace('/\xB0/u'," degrees ", $result ); - I get an empty string

If I do this instead:

$result = preg_replace('/\u00B0/u'," degrees ", $result ); - I get the error:

Warning: preg_replace() [function.preg-replace]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u at offset 1 in /var/www/html/includes/classes/redeyeTable.inc.php on line 75

I'm not great with encodings... what am I doing wrong here?

Recommended answer

Use

$result = preg_replace('/\x{00B0}/u'," degrees ", $result );

Please see here for more information on the \x{FFFF}-syntax.
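As a minimal, self-contained sketch of the fix above: the sample string is a shortened copy of the one in the question, the variable name $replaced is only illustrative, and the degree sign is written as its two UTF-8 bytes so the example does not depend on how the source file is saved. The replacement text is 'degrees' without surrounding spaces, to match the output the question asks for.

```php
<?php
// Shortened sample text from the question. "\xC2\xB0" inside a double-quoted
// string produces the two UTF-8 bytes of the degree sign.
$result = "bei einer Temperatur, die etwa 20 bis 60\xC2\xB0C unterhalb des Schmelzpunktes";

// \x{00B0} names the code point U+00B0; the /u modifier makes PCRE treat
// both the pattern and the subject as UTF-8.
$replaced = preg_replace('/\x{00B0}/u', 'degrees', $result);

if ($replaced === null) {
    // preg_replace() returns null on failure, for example when the
    // subject is not valid UTF-8.
    echo 'preg error code: ', preg_last_error(), PHP_EOL;
} else {
    echo $replaced, PHP_EOL;
    // bei einer Temperatur, die etwa 20 bis 60degreesC unterhalb des Schmelzpunktes
}
```

Checking for null before using the result makes an encoding problem in the subject visible instead of silently producing an empty value.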

It's important to note the difference between \xB0 and \x{00B0}:

  • \xB0 denotes a single character with hex-code B0 (176 decimal) which is the degree symbol (°) in ISO-8859-1 for example
  • \x{00B0} denotes the unicode codepoint U+00B0 which describes the degree symbol (°) in the unicode system. This codepoint will be encoded using two bytes \xC2\xB0 when using UTF-8 encoding.
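The byte-level difference between the two encodings can be checked directly. This is only an illustrative sketch: the variable names are made up for the example, and the empty pattern with the /u modifier is used as a simple UTF-8 validity probe. It is not a diagnosis of what happened in the question.

```php
<?php
// U+00B0 encoded as UTF-8 is the two bytes C2 B0.
$utf8Degree = "\xC2\xB0";
// The same symbol in ISO-8859-1 is the single byte B0.
$latin1Degree = "\xB0";

echo bin2hex($utf8Degree), PHP_EOL;   // c2b0
echo strlen($utf8Degree), PHP_EOL;    // 2
echo strlen($latin1Degree), PHP_EOL;  // 1

// With /u, PCRE validates the subject as UTF-8 before matching.
// A lone B0 byte is not valid UTF-8, so the call fails and returns false.
var_dump(preg_match('//u', $utf8Degree));   // int(1)
var_dump(preg_match('//u', $latin1Degree)); // bool(false)
```

If the second check fails on real data, the string is probably not UTF-8 at that point, and the connection or column encoding is worth inspecting before changing the pattern.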
