多字节strtr()-> mb_strtr() [英] multibyte strtr() -> mb_strtr()

查看:100
本文介绍了多字节strtr()-> mb_strtr()的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有人写过函数strtr()的多字节变体吗?我需要这个.

Does anyone have written multibyte variant of function strtr() ? I need this one.

编辑1(所需用法示例):


Example:
$from = 'ľľščťžýáíŕďňäô'; // these chars are in UTF-8
$to   = 'llsctzyaiŕdnao';

// input - in UTF-8
$str  = 'Kŕdeľ ďatľov učí koňa žrať kôru.';
$str  = mb_strtr( $str, $from, $to );

// output - str without diacritic
// $str = 'Krdel datlov uci kona zrat koru.';

推荐答案

相信 strtr是多字节安全的,因为str_replace 多字节安全的,所以可以将其包装起来:

I believe strtr is multi-byte safe, either way since str_replace is multi-byte safe you could wrap it:

function mb_strtr($str, $from, $to)
{
  return str_replace(mb_str_split($from), mb_str_split($to), $str);
}

由于没有mb_str_split函数,您还需要编写自己的函数(使用mb_substrmb_strlen),或者您可以只使用

Since there is no mb_str_split function you also need to write your own (using mb_substr and mb_strlen), or you could just use the PHP UTF-8 implementation (changed slightly):

function mb_str_split($str) {
    return preg_split('~~u', $str, null, PREG_SPLIT_NO_EMPTY);;

}

但是,如果您要寻找一个功能来删除字符串中所有(拉丁字母)重音符号,则可能会发现以下功能有用:

However if you're looking for a function to remove all (latin?) accentuations from a string you might find the following function useful:

function Unaccent($string)
{
    return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8'));
}

echo Unaccent('ľľščťžýáíŕďňä'); // llsctzyairdna
echo Unaccent('Iñtërnâtiônàlizætiøn'); // Internationalizaetion

这篇关于多字节strtr()-> mb_strtr()的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆