PHP: Using special characters in bad word obfuscator

Question

I'm using this bad word detector/obfuscator in php (to be Adsense compliant). It shows the first letter of the bad word, and replaces the remaining letters with this character: ▪

It works fine, except when I'm using words that contain special characters in Spanish, for example: ñ, á, ó, etc.

This is my current code:

<?php
function badwords_full($string, &$bad_references) {
    static $bad_counter;
    static $bad_list;
    static $bad_list_q;
    if(!isset($bad_counter)) {
        $bad_counter = 0;
        $bad_list = badwords_list();
        $bad_list_q = array_map('preg_quote', $bad_list);
    }
    return preg_replace_callback('~('.implode('|', $bad_list_q).')~',
        function($matches) use (&$bad_counter, &$bad_references) {
            $bad_counter++;
            $bad_references[$bad_counter] = $matches[0];
            return substr($matches[0], 0, 1).str_repeat('&squf;', strlen($matches[0]) - 1);
    }, $string);
}

function badwords_list() {
    # spanish
    $es = array(
        "gallina",
        "ñoño"
    );

    # english
    $en = array(
        "chicken",
        "horse"
    );

    # join all languages
    $list = array_merge($es, $en);
    usort($list, function($a,$b) {
        return strlen($b) - strlen($a); // longest words first
    });
    return $list;
}

$bad = []; //holder for bad words

Test 1:

echo badwords_full('Hello, you are a chicken!', $bad);

Result 1:

Hello, you are a c▪▪▪▪▪▪! (works fine)

Test 2:

echo badwords_full('Hola en español eres un ñoño!', $bad);

Result 2:

Hola en español eres un �▪▪▪▪▪!

Any ideas on how to solve this issue? Thanks!

Recommended answer

You are splitting a multibyte character in half. Use mb_substr in place of substr.

return mb_substr($matches[0], 0, 1).str_repeat('&squf;', strlen($matches[0]) - 1);

https://3v4l.org/AnPJl
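
To see why the first character breaks, it can help to compare byte counts with character counts. This is a small sketch, assuming the input is UTF-8 and the mbstring extension is loaded; the variable name $word is only for illustration.

```php
<?php
// Assumes UTF-8 strings and the mbstring extension.
$word = "\u{F1}o\u{F1}o"; // "ñoño", written with escapes so the encoding is unambiguous

// strlen() counts bytes: each ñ takes two bytes in UTF-8, so this is 6, not 4.
var_dump(strlen($word));             // int(6)
var_dump(mb_strlen($word, 'UTF-8')); // int(4)

// substr() cuts after the first byte, leaving half of ñ (invalid UTF-8).
var_dump(bin2hex(substr($word, 0, 1)));             // string(2) "c3"

// mb_substr() cuts after the first character, keeping both bytes of ñ.
var_dump(bin2hex(mb_substr($word, 0, 1, 'UTF-8'))); // string(4) "c3b1"
```

The lone byte c3 is what the browser renders as the replacement character in Result 2.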

You also probably want to use mb_strlen in place of strlen.
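
Putting both suggestions together, the replacement could look roughly like the sketch below. It is not the asker's exact function: mask_bad_words() and its parameters are hypothetical names, the word list is passed in instead of coming from badwords_list(), and it assumes UTF-8 input with the mbstring extension available. The `u` pattern modifier is an extra assumption on top of the answer, added so that PCRE also treats the text as UTF-8.

```php
<?php
// Sketch only: mask_bad_words() is a hypothetical helper, not the original badwords_full().
// Assumes UTF-8 input and the mbstring extension.
function mask_bad_words(string $text, array $badWords, array &$found): string
{
    if (!$badWords) {
        return $text; // nothing to match; avoids building an empty pattern
    }

    // Longest words first, so a longer word wins over a shorter one that is its prefix.
    usort($badWords, function ($a, $b) {
        return mb_strlen($b, 'UTF-8') - mb_strlen($a, 'UTF-8');
    });

    // Quote each word for use inside a pattern delimited by "~".
    $quoted = array_map(function ($w) {
        return preg_quote($w, '~');
    }, $badWords);

    // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = '~(' . implode('|', $quoted) . ')~u';

    $result = preg_replace_callback($pattern, function ($m) use (&$found) {
        $found[] = $m[0];
        // Count and cut by characters, not bytes.
        $first = mb_substr($m[0], 0, 1, 'UTF-8');
        $rest  = mb_strlen($m[0], 'UTF-8') - 1;
        return $first . str_repeat('&squf;', $rest);
    }, $text);

    // preg_replace_callback() returns null on error (for example invalid UTF-8 input).
    return $result ?? $text;
}

$found = [];
$words = ['gallina', "\u{F1}o\u{F1}o", 'chicken']; // "\u{F1}o\u{F1}o" is "ñoño"
echo mask_bad_words("Hola en espa\u{F1}ol eres un \u{F1}o\u{F1}o!", $words, $found), "\n";
```

With character-based counting, ñoño keeps its first letter intact and gets three replacement squares instead of five.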
