How to remove offensive words from a post with PHP?

Question

Assume "xyza" is a bad word. I'm using the following method to replace offensive words:

$text = str_replace("xyza", "(Offensive words detected & removed!)", $text);

This code replaces xyza with "(Offensive words detected & removed!)".

The problem is letter case: if someone types XYZA, my code can't detect it. How can I solve this?

Recommended answer

No matter what you do, users will find ways to get around your filters. They will use Unicode look-alike characters (аss, for example, uses a Cyrillic а and will not be caught by any of the regex solutions). They will use spaces, dollar signs, asterisks, whatever you haven't managed to catch yet.
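For the narrow case-sensitivity problem in the question, PHP's built-in str_ireplace() is the case-insensitive counterpart of str_replace(). The sketch below is only an illustration: the censor() helper and the $badWords list are hypothetical names, not part of the original post. It also demonstrates the bypass described above, where a Cyrillic look-alike letter slips past the filter.

```php
<?php
// Hypothetical helper (not from the original post): replaces every listed
// word regardless of ASCII letter case by using str_ireplace().
function censor(string $text, array $badWords): string
{
    $notice = "(Offensive words detected & removed!)";
    return str_ireplace($badWords, $notice, $text);
}

$badWords = ["xyza"];

echo censor("xyza", $badWords), "\n"; // replaced
echo censor("XYZA", $badWords), "\n"; // replaced, because case is ignored
// The last letter below is Cyrillic U+0430, not Latin "a", so the bytes
// differ from the listed word and the text is left unchanged.
echo censor("xyzа", $badWords), "\n";
```

Note that str_ireplace() matches substrings, so a listed word is also replaced when it appears inside a longer, harmless word. That is one more reason to treat a word list as a first pass and not as a complete solution.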

If family-friendliness is essential to your application, have a person review the content before it goes live. Otherwise, add a flag feature so other people can flag offensive content. Better yet, use some sort of machine learning or Bayesian filter to automatically flag potentially offensive posts and have humans check them manually. People read human languages better than computers do.
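To make the Bayesian-filter idea concrete, here is a toy naive Bayes classifier that flags a post for human review and leaves the text itself untouched. Everything in it is an assumption for illustration: the ToyBayesFlagger class, the two labels, and the four training sentences are made up, and a real filter would need far more training data and a Unicode-aware tokenizer.

```php
<?php
// Toy naive Bayes flagger. Class name, labels, and training data are
// illustrative assumptions, not part of the original answer.
final class ToyBayesFlagger
{
    private array $wordCounts = ['ok' => [], 'offensive' => []]; // word => count, per label
    private array $docCounts  = ['ok' => 0, 'offensive' => 0];   // training posts per label
    private array $vocab      = [];                              // every word seen in training

    // ASCII-only tokenizer: lowercase, then split on non-word characters.
    private function tokenize(string $text): array
    {
        return preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    }

    public function train(string $text, string $label): void
    {
        $this->docCounts[$label]++;
        foreach ($this->tokenize($text) as $word) {
            $this->wordCounts[$label][$word] = ($this->wordCounts[$label][$word] ?? 0) + 1;
            $this->vocab[$word] = true;
        }
    }

    // Returns true when "offensive" scores higher than "ok".
    // Assumes train() has been called at least once for each label.
    public function needsReview(string $text): bool
    {
        $totalDocs = array_sum($this->docCounts);
        $vocabSize = count($this->vocab);
        $scores = [];
        foreach ($this->docCounts as $label => $docs) {
            $totalWords = array_sum($this->wordCounts[$label]);
            $score = log($docs / $totalDocs); // log prior
            foreach ($this->tokenize($text) as $word) {
                $count = $this->wordCounts[$label][$word] ?? 0;
                // Laplace (add-one) smoothing avoids log(0) for unseen words.
                $score += log(($count + 1) / ($totalWords + $vocabSize));
            }
            $scores[$label] = $score;
        }
        return $scores['offensive'] > $scores['ok'];
    }
}

$flagger = new ToyBayesFlagger();
$flagger->train("you are a xyza", "offensive");
$flagger->train("what a xyza idea", "offensive");
$flagger->train("thanks for the helpful answer", "ok");
$flagger->train("what a great idea", "ok");

var_dump($flagger->needsReview("XYZA"));                  // bool(true)
var_dump($flagger->needsReview("thanks for the answer")); // bool(false)
```

A flagged post would then go into a moderation queue so that a person makes the final call, which matches the answer's point that people read human language better than computers do.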
