php preg_grep and umlaut/accent


Question

I have an array of terms, some of which contain accented characters. I run preg_grep on it like this:

$data= array('Napoléon','Café');
$result = preg_grep('~' . $input . '~i', $data);

So if the user types 'le', I would also want 'Napoléon' to be matched, which does not work with the above command.

I did some searching and found that this function might be relevant:

preg_match("/[\w\pL]/u",$var);

How can I combine these and make it work?

Recommended answer

This is not possible with a regular expression pattern alone, because you cannot tell the regex engine to match "e" and all similar characters. However, you can first normalize the input data (both the array and the search input), search the normalized data, and then return the corresponding results from the non-normalized data.

In the following example I use transliteration to do this kind of normalization. I guess that is what you're looking for:

$data = ['Napoléon', 'Café'];

$result = array_translit_search('le', $data);
print_r($result);

$result = array_translit_search('leó', $data);
print_r($result);

The example output is:

Array
(
    [0] => Napoléon
)
Array
(
    [0] => Napoléon
)

The search function itself is rather straightforward, as described above: it transliterates the inputs, runs preg_grep, and then returns the matching original inputs:

/**
 * @param string $search
 * @param array $data
 * @return array
 */
function array_translit_search($search, array $data) {

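    // Reverse of 'ASCII-Latin', i.e. Latin to ASCII (requires the Intl extension).
    $transliterator = Transliterator::create('ASCII-Latin', Transliterator::REVERSE);
    $normalize      = function ($string) use ($transliterator) {

        return $transliterator->transliterate($string);
    };

    // Transliterate both the haystack and the search term.
    $dataTrans   = array_map($normalize, $data);
    $searchTrans = $normalize($search);
    // Pass the delimiter so that a '/' in the search term is escaped as well.
    $pattern     = sprintf('/%s/i', preg_quote($searchTrans, '/'));
    $result      = preg_grep($pattern, $dataTrans);
    // preg_grep preserves keys, so they select the original (accented) entries.
    return array_intersect_key($data, $result);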
}

This code requires the Transliterator class from the Intl extension. You can replace it with any other similar transliteration or translation function.

By the way, I would not suggest using str_replace here. If you need to fall back to a translation table, use strtr instead; that is what you're looking for. But I prefer a library that brings its own translation tables, and the Intl library in particular is normally hard to beat.
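For the case where Intl is not available, the strtr fallback mentioned above might look roughly like the following. This is only a sketch: the function name array_strtr_search and the translation table are assumptions made for illustration, and the table covers just a handful of characters, not a complete mapping.

```php
<?php
/**
 * Hypothetical fallback without the Intl extension (not part of the
 * original answer): normalize with a hand-written strtr() table.
 *
 * @param string $search
 * @param array  $data
 * @return array matching entries of $data, with their original keys
 */
function array_strtr_search($search, array $data) {

    // Illustrative table only; extend it for the characters you expect.
    $table = [
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'ö' => 'o',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'À' => 'A', 'É' => 'E', 'Ö' => 'O', 'Ü' => 'U',
        'ç' => 'c', 'ñ' => 'n', 'ß' => 'ss',
    ];
    $normalize = function ($string) use ($table) {

        // With an array argument, strtr() replaces in a single pass.
        return strtr($string, $table);
    };

    $dataTrans   = array_map($normalize, $data);
    $searchTrans = $normalize($search);
    $pattern     = sprintf('/%s/i', preg_quote($searchTrans, '/'));
    $result      = preg_grep($pattern, $dataTrans);
    // Keys are preserved throughout, so they select the original entries.
    return array_intersect_key($data, $result);
}

print_r(array_strtr_search('le', ['Napoléon', 'Café']));
```

The reason to prefer strtr over str_replace for such a table is that str_replace with arrays applies the pairs one after another, so a later pair can rewrite the output of an earlier one, while strtr with an array never rescans text it has already replaced.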
