How to remove duplicate rows considering the Arabic phonetics

Problem description

I have a table of Arabic text and I want to remove duplicate rows while taking the Arabic diacritic marks (fatha, kasra, damma) into account: َ ِ ُ

My table: vocabulary

+----+----------+--------------------------------+
| id |   word   |              mean              |
--------------------------------------------------
| 1 |    سِلام    |              xxx               |
--------------------------------------------------
| 2 |    سَلام    |              xxx               |
--------------------------------------------------
| 3 |    سلام    |              xxx               |
--------------------------------------------------
| 4 |    سلام    |              xxx               |
+------------------------------------------------+  

Now I want this table:

+----+----------+--------------------------------+
| id |   word   |              mean              |
--------------------------------------------------
| 1 |    سِلام    |              xxx               |
--------------------------------------------------
| 2 |    سَلام    |              xxx               |
--------------------------------------------------
| 3 |    سلام    |              xxx               |
+------------------------------------------------+

How can I do that?

My attempt:

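// Note: the mysql_* functions were removed in PHP 7; they are kept here as in the original attempt.
$result = mysql_query("SELECT * FROM vocabulary");
while ($end = mysql_fetch_assoc($result)) {
    $word = $end["word"];
    $mean = $end["mean"];
    $id   = $end["id"];

    // Count the rows that share this word and meaning.
    $result2 = mysql_query("SELECT * FROM vocabulary WHERE word='$word' AND mean='$mean'");
    $TotalResults = mysql_num_rows($result2);

    // More than one match means this row has a duplicate, so delete it.
    if ($TotalResults > 1) {
        mysql_query("DELETE FROM vocabulary WHERE id='$id'");
    }
}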

Summary: How can I make MySQL sensitive to the Arabic symbols?

Solution

There are multiple ways to achieve this.

1- You can select your rows from the database, loop through them, and save each 'word' value in an array. In each iteration of the loop, check with in_array() whether the same value has already been seen. If it has, save the row's id in a second array, and then use those ids to delete the rows from the database.

2- Another way to extract the ids is to use a query similar to the one below, which returns one id per group of identical words:

select count(*), max(id) as id from vocabulary group by word

You can then loop through the results and delete the rows (using the ids) where the count is greater than 1.
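The first approach can be sketched in plain PHP without touching the database. This is only an illustration under assumptions: the function name findDuplicateIds and the hard-coded $rows array are hypothetical stand-ins for rows fetched from the vocabulary table, and the "\u{...}" escapes require PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the ids of rows whose (word, mean) pair
// has already been seen earlier in the list.
function findDuplicateIds(array $rows): array
{
    $seen = [];
    $duplicateIds = [];
    foreach ($rows as $row) {
        // Byte-exact key, so words that differ only in diacritics stay distinct.
        $key = $row['word'] . "\0" . $row['mean'];
        if (in_array($key, $seen, true)) {
            $duplicateIds[] = $row['id'];   // a later copy: mark it for deletion
        } else {
            $seen[] = $key;                 // first occurrence: keep it
        }
    }
    return $duplicateIds;
}

// Sample data mirroring the question's table (assumed values).
$rows = [
    ['id' => 1, 'word' => "\u{0633}\u{0650}\u{0644}\u{0627}\u{0645}", 'mean' => 'xxx'], // with kasra
    ['id' => 2, 'word' => "\u{0633}\u{064E}\u{0644}\u{0627}\u{0645}", 'mean' => 'xxx'], // with fatha
    ['id' => 3, 'word' => "\u{0633}\u{0644}\u{0627}\u{0645}", 'mean' => 'xxx'],         // no diacritics
    ['id' => 4, 'word' => "\u{0633}\u{0644}\u{0627}\u{0645}", 'mean' => 'xxx'],         // same as id 3
];

print_r(findDuplicateIds($rows)); // only id 4 is reported as a duplicate
```

The returned ids could then be passed to a DELETE statement. Because the comparison happens in PHP, it does not depend on how the database compares the strings.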

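The basic concept in both (and in other methods) is that you just have to match the strings. The diacritics on the letters change the actual string, so "سَلام" is not equal to "سلام". Whether MySQL itself treats the two as different depends on the column's collation; a binary collation such as utf8mb4_bin compares them code point by code point and therefore keeps them distinct.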
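To make the point about string matching concrete, here is a small sketch (PHP 7 or later, for the "\u{...}" escapes). The variable names and the stripHarakat helper are hypothetical and only serve the illustration; the helper covers the code points U+064B to U+0652, not every Arabic mark.

```php
<?php
// seen + fatha + lam + alef + meem
$withFatha = "\u{0633}\u{064E}\u{0644}\u{0627}\u{0645}";
// seen + lam + alef + meem (no diacritics)
$plain = "\u{0633}\u{0644}\u{0627}\u{0645}";

// The fatha (U+064E) is a separate code point, so the two strings differ.
var_dump($withFatha === $plain); // bool(false)

// Hypothetical helper: strips the common harakat (U+064B to U+0652).
// Only useful if you ever want to treat the two spellings as the same word.
function stripHarakat(string $s): string
{
    return preg_replace('/[\x{064B}-\x{0652}]/u', '', $s) ?? $s;
}

var_dump(stripHarakat($withFatha) === $plain); // bool(true)
```

For the question as asked, the strict comparison is the relevant one: rows 1, 2 and 3 hold three different strings, and only row 4 repeats an existing one.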

On a side note, there is a great Arabic PHP library you can use for various Arabic-related string manipulation tasks: PHP and Arabic Language.

Note that this approach removes only one duplicate per group at a time, so it may need to be repeated if a word appears more than twice.

There are several other ways to do it. Which one fits best depends on the size of your data set and on whether deleting these duplicates is a one-time task or a recurring one, since you will have to keep performance in mind.
