What regex pattern do I need for this?

Question

I need a regex (to work in PHP) to replace American English words in HTML with British English words. So color would be replaced by colour, meters by metres, and so on. [I know that meters is also a British English word, but in the copy we'll be using it will always refer to units of distance rather than measuring devices.] The pattern would need to work accurately in the following (slightly contrived) examples (although, as I have no control over the actual input, these could exist):

<span style="color:red">This is the color red</span>

[should not replace color in the HTML tag, but should replace it in the sentence]

<p>Color: red</p>

[should replace the word]

<p>Tony Brammeter lives 2000 meters from his sister</p>

[should replace meters where it is a word, but not inside the name]

I know there are edge cases where replacement would be wrong (if his name was Tony Meter, for example), but these are rare enough that we can deal with them when they come up.

Recommended answer

HTML/XML should not be processed with regular expressions alone; it is really hard to write one that handles every case. But you can use the built-in DOM extension and process your string recursively, applying the replacements only to text nodes:

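# Warning: untested code!
function process(DOMNode $node, array $replaceRules) {
    // Copy the live child list into an array before iterating.
    foreach (iterator_to_array($node->childNodes) as $childNode) {
        if ($childNode instanceof DOMText) {
            // Only text nodes are rewritten; tags and attributes
            // such as style="color:red" are left untouched.
            $childNode->data = preg_replace(
                array_keys($replaceRules),
                array_values($replaceRules),
                $childNode->data
            );
        } else {
            process($childNode, $replaceRules);
        }
    }
}
// The first letter and an optional plural "s" are captured and reused,
// so "Color" becomes "Colour" and "meters" becomes "metres".
$replaceRules = array(
    '/\b(c)olor(s?)\b/i' => '${1}olour${2}',
    '/\b(m)eter(s?)\b/i' => '${1}etre${2}',
);
$doc = new DOMDocument();
$doc->loadHTML($htmlString);
process($doc, $replaceRules);
$htmlString = $doc->saveHTML();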
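
To see what word-boundary rules of this kind do on their own, here is a minimal sketch that runs them over the question's sample strings as plain text. The helper name `replaceWords` and the case-preserving capture groups are assumptions made for illustration, not something the question specifies. The last call shows why the rules should only ever be given text nodes: applied to raw markup, they also rewrite the CSS property.

```php
<?php
// Hypothetical helper: applies US -> UK spelling rules to a plain string.
function replaceWords(string $text): string
{
    // \b anchors each pattern to whole words, so "Brammeter" is left alone.
    // The captured first letter and optional plural "s" are copied back
    // into the replacement to keep capitalisation and number.
    $rules = [
        '/\b(c)olor(s?)\b/i' => '${1}olour${2}',
        '/\b(m)eter(s?)\b/i' => '${1}etre${2}',
    ];
    return preg_replace(array_keys($rules), array_values($rules), $text);
}

// Plain-text parts of the examples from the question.
echo replaceWords('This is the color red'), "\n";
echo replaceWords('Color: red'), "\n";
echo replaceWords('Tony Brammeter lives 2000 meters from his sister'), "\n";

// Raw markup: the style attribute is changed too, which is the failure
// that restricting the replacement to DOM text nodes avoids.
echo replaceWords('<span style="color:red">This is the color red</span>'), "\n";
```

The first three calls give the spellings the question asks for, while the fourth turns `color:red` into `colour:red`, which is not valid CSS.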
