Badword filter in PHP?

Question

I am writing a badword filter in PHP.

I have a list of badwords in an array and the method cleanse_text() is written like this:

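public static function cleanse_text($originalstring){
   // Make sure the word list has been sorted before first use
   if (!self::$is_sorted) self::doSort();
   // Case-insensitive replacement of every listed word, wherever it occurs
   return str_ireplace(self::$badwords, '****', $originalstring);
}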

This works trivially for exact matches, but I also want to censor words that have been disguised, like 'ab*d' where 'abcd' is a bad word. This is proving to be a bit more difficult.

Here are my questions:

  1. Is a badword filter worth bothering with? (It is a site for professionals, so I would have thought a certain minimum decorum is required.)

  2. Is it worth the hassle of trying to catch obvious workarounds like 'f*ck', or should I not attempt to filter those out?

  3. Is there a better way of writing the cleanse_text() method above?

Recommended answer

I would definitely not bother with it.

  1. It's a site for professionals, so you can assume that they will act appropriately. Some moderation and enforcement of rules will put anyone in line. Look at Stack Overflow for example. Even without the community moderation tools, people can be pressured into behaving appropriately.

  2. It would fail. There would be too many false positives ("clbuttic"), and a list containing every possible swear word would be impossible to maintain. Replacing certain letters (e.g. f*ck) makes it no less offensive. Removing the word altogether destroys meaning, which is a huge problem with false positives.

Consider a discussion about donkeys and birds. It'd be all about asses, tits, boobies and cocks.
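
To make the false-positive problem concrete, here is a minimal sketch. The one-word list, the replacement text, the sample sentences and the helper name cleanse_whole_words() are all made up for illustration; none of them come from the class in the question. It compares the substring replacement used by cleanse_text() with a word-boundary regex, which is one possible refinement and not a complete fix.

```php
<?php
// Hypothetical word list, for illustration only.
$badwords = ['ass'];

// Substring replacement, as in cleanse_text(): it also rewrites
// innocent words that merely contain a listed word.
echo str_ireplace($badwords, 'butt', 'a classic assessment'), "\n";
// prints "a clbuttic buttessment"

// Hypothetical helper: only replace listed words that stand alone.
function cleanse_whole_words(array $badwords, string $text, string $mask = '****'): string
{
    if ($badwords === []) {
        return $text; // nothing to filter
    }
    // Escape each word so it is matched literally inside the pattern.
    $escaped = array_map(static function ($word) {
        return preg_quote($word, '/');
    }, $badwords);
    // \b anchors the match to word boundaries; i = case-insensitive, u = UTF-8.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/iu';
    // preg_replace() returns null on error; fall back to the original text.
    return preg_replace($pattern, $mask, $text) ?? $text;
}

echo cleanse_whole_words($badwords, 'a classic assessment'), "\n";
// prints "a classic assessment"
echo cleanse_whole_words($badwords, 'The ass carried the load'), "\n";
// prints "The **** carried the load"
```

Word boundaries get rid of the "clbuttic" case, but the donkey sentence is still censored, because no pattern can tell which meaning of a word was intended. It also does nothing for disguised spellings, so the maintenance problem described above remains.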
