Bad word filter written natively in MySQL without PHP
Question
For speed reasons I'm thinking about a bad word filter written entirely in MySQL, but in my search I only found the MySQL REPLACE function:
REPLACE(string_column, 'search', 'replace')
But with this function I can only replace one word at a time. Is there a string function in MySQL that can check the whole string and search for and replace multiple values taken from a table? (With PHP it is entirely clear to me how to do this easy task.)
Is a MySQL loop a reasonable solution?
I'm grateful for any hint.
Recommended answer
I'm posting this as a new answer, since I'm using a different technique here. I think we can just use a MySQL function and a BEFORE INSERT trigger. The function to split a string is taken from another answer.
CREATE FUNCTION strSplit(x VARCHAR(1000), delim VARCHAR(12), pos INTEGER)
RETURNS VARCHAR(1000)
DETERMINISTIC
BEGIN
  -- Return the pos-th (1-based) token of x split on delim,
  -- or NULL when that token is empty or does not exist.
  DECLARE output VARCHAR(1000);
  SET output = REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),
                                 CHAR_LENGTH(SUBSTRING_INDEX(x, delim, pos - 1)) + 1),
                       delim,
                       '');
  IF output = '' THEN SET output = NULL; END IF;
  RETURN output;
END
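As a quick check of how the splitter behaves, the calls below pull individual words out of a sample sentence. This is only a sketch: the sample sentence and the column aliases are made up for illustration, and it assumes strSplit has already been created as shown above.

```sql
-- Assumes the strSplit function from the answer already exists.
SELECT strSplit('this is bad', ' ', 1) AS first_word,    -- 'this'
       strSplit('this is bad', ' ', 3) AS third_word,    -- 'bad'
       strSplit('this is bad', ' ', 4) AS past_the_end;  -- NULL
```

The NULL returned for a position past the last word is what a caller can use as its stop condition when walking through a sentence word by word.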
The INSERT trigger would look like this:
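CREATE TRIGGER change_words
BEFORE INSERT ON sentences
FOR EACH ROW
BEGIN
  DECLARE i INT;             -- 1-based position of the current word
  DECLARE s VARCHAR(1000);   -- current word after replacement
  DECLARE r VARCHAR(1000);   -- rebuilt sentence (starts as NULL)
  SET i = 1;
  SET s = '';
  REPEAT
    -- Take the i-th word and swap it for its replacement if it is listed in words.
    SET s = (
      SELECT
        REPLACE(split, COALESCE(bad, ''), good)
      FROM
        (SELECT strSplit(NEW.sentence, ' ', i) AS split) s
        LEFT JOIN words w ON s.split = w.bad
      LIMIT 1
    );
    -- CONCAT_WS skips NULL arguments, so r grows one word at a time.
    SET r = CONCAT_WS(' ', r, s);
    SET i = i + 1;
  UNTIL s IS NULL   -- strSplit returns NULL past the last word
  END REPEAT;
  SET NEW.sentence = r;
END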
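To try the trigger out, something like the following setup could be used. Only the table names (sentences, words) and the column names (sentence, bad, good) come from the trigger itself; the column types, the keys, the id column and the sample rows are assumptions made for this sketch.

```sql
-- Assumed schema: names come from the trigger, everything else is illustrative.
CREATE TABLE words (
  bad  VARCHAR(50) NOT NULL PRIMARY KEY,  -- word to filter
  good VARCHAR(50) NOT NULL               -- replacement text
);

CREATE TABLE sentences (
  id       INT AUTO_INCREMENT PRIMARY KEY,
  sentence VARCHAR(1000)
);

INSERT INTO words (bad, good) VALUES ('darn', 'd***');

-- Once strSplit and the change_words trigger exist, the trigger
-- rewrites NEW.sentence before the row is stored.
INSERT INTO sentences (sentence) VALUES ('what a darn mess');
SELECT sentence FROM sentences;
```

The tables have to exist before the trigger is created, because CREATE TRIGGER needs its target table. When the function and trigger definitions are run through the mysql command-line client, the statement delimiter has to be changed first (for example with DELIMITER //), so that the semicolons inside BEGIN ... END are not read as the end of the statement.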
This will be faster, since the sentence is converted only once, when it is inserted into the database. Some improvements are still needed, same as before:
LEFT JOIN words w ON s.split = w.bad
it won't match words that include a separator such as , . ! ? and the replace function
REPLACE(split, COALESCE(bad, ''), good)
will be case sensitive. This can be fixed quite easily if you wish. Please see a fiddle here.
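One possible way to address both limitations is sketched below as a standalone query; it is not the answer's own fix. It assumes MySQL 8.0 or later for REGEXP_REPLACE, and the words_demo table, the token and core aliases and the sample word are all hypothetical. The idea is to strip trailing punctuation before the lookup, compare in lower case, and re-attach the punctuation after the replacement.

```sql
-- Hypothetical demo table mirroring the answer's words(bad, good).
CREATE TABLE words_demo (
  bad  VARCHAR(50) NOT NULL PRIMARY KEY,
  good VARCHAR(50) NOT NULL
);
INSERT INTO words_demo (bad, good) VALUES ('darn', 'd***');

SELECT
  CASE
    -- No match: keep the word exactly as it was written.
    WHEN w.good IS NULL THEN t.token
    -- Match: use the replacement, then append whatever followed the core word.
    ELSE CONCAT(w.good, SUBSTRING(t.token, CHAR_LENGTH(t.core) + 1))
  END AS filtered
FROM (
  SELECT x.token,
         -- core = the word without trailing , . ! ? characters (MySQL 8.0+)
         REGEXP_REPLACE(x.token, '[,.!?]+$', '') AS core
  FROM (SELECT 'Darn!' AS token) AS x
) AS t
LEFT JOIN words_demo w ON LOWER(t.core) = LOWER(w.bad);
```

Inside the trigger, the same expressions could take the place of the REPLACE call and the join condition, with strSplit(NEW.sentence, ' ', i) supplying the token. Because the join already matches whole words, returning good directly avoids depending on a case-sensitive REPLACE.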