Sort array of words - non-English letters + double-character letters (PHP)
Question
I want to sort an array of words alphabetically. Unfortunately, in my language (Croatian) there are double-character letters (e.g. lj, nj, dž) and letters that the PHP sort function does not order correctly (e.g. č, ć, ž, š, đ).
Here is the Croatian alphabet in its proper order (with some English letters as well):
$alphabet = array(
'a', 'b', 'c',
'č', 'ć', 'd',
'dž', 'đ', 'e',
'f', 'g', 'h',
'i', 'j', 'k',
'l', 'lj', 'm',
'n', 'nj', 'o',
'p', 'q', 'r',
's', 'š', 't',
'u', 'v', 'w',
'x', 'y', 'z', 'ž'
);
And here is a list of words, also properly ordered:
$words = array(
'alfa', 'beta', 'car', 'čvarci', 'ćup', 'drvo', 'džem', 'đak', 'endem', 'fićo', 'grah', 'hrana', 'idealan', 'jabuka', 'koza', 'lijep', 'ljestve', 'mango',
'nebo', 'njezin', 'obrva', 'pivnica', 'qwerty', 'riba', 'sir', 'šaran', 'tikva', 'umanjenica', 'večera', 'wind', 'x-ray', 'yellow', 'zakaj', 'žena'
);
I was thinking of ways to sort it. One way was to split each word into letters. Since I didn't know how to do that because of the multi-character letters, I asked a question and got a good answer that solved that problem (see here). So I looped through the array and split each word into letters using the code from the best answer.
After the loop I had a new array (let's call it $words_splitted). Its elements were arrays as well, each representing a word.
Array
(
[0] => Array
(
[0] => a
[1] => l
[2] => f
[3] => a
)
[1] => Array
(
[0] => b
[1] => e
[2] => t
[3] => a
)
[2] => Array
(
[0] => c
[1] => a
[2] => r
)...
...[16] => Array
(
[0] => lj
[1] => e
[2] => s
[3] => t
[4] => v
[5] => e
)
The idea was to compare the letters of each array by their index in the $alphabet variable. For example, $words_splitted[0][0] would be compared with $words_splitted[1][0], and then with $words_splitted[2][0], etc. If we compare the letters 'a' and 'b', the letter 'a' has the smaller index in $alphabet, so it comes before 'b'.
Unfortunately, I got stuck and I'm not sure how to do this. Any ideas?
NOTE: PHP extensions shouldn't be used.
Recommended answer
Here is a class that can help you sort an array of strings based on a specific alphabet table:
<?php
/**
 * This class can be used to compare unicode strings.
 * It can be used for easy array sorting.
 *
 * You can set your own alphabet characters table to be used.
 */
class UnicodeStringComperator {
    private $alphabet = [];

    public function __construct() {
        // We set the default alphabet characters table to a-z.
        $this->alphabet = range('a', 'z');
    }

    /**
     * Set the characters table to use for sorting
     *
     * @param array $alphabet The characters table for the sorting
     */
    public function setAlphabet($alphabet) {
        $this->alphabet = $alphabet;
    }

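    /**
     * Split the string into an array of letters.
     * Multi-character letters from the alphabet table (e.g. 'lj', 'dž') are
     * kept together; any other character becomes an element of its own.
     *
     * @param string $str The string to split
     * @return array The array of the letters in the string
     */
    public function splitter($str) {
        // Try the longest letters first so that 'dž' wins over 'd'.
        $letters = $this->alphabet;
        usort($letters, function ($a, $b) {
            return strlen($b) - strlen($a);
        });
        $alternatives = array_map(function ($letter) {
            return preg_quote($letter, '/');
        }, $letters);
        $alternatives[] = '.'; // fallback: any single character
        preg_match_all('/' . implode('|', $alternatives) . '/us', $str, $matches);
        return isset($matches[0]) ? $matches[0] : [];
    }
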
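    /**
     * Find the place of the char in the alphabet table
     *
     * @param string $chr The character to find
     * @return int The place of the char in the table, or -1 if not found
     */
    public function place($chr) {
        $index = array_search($chr, $this->alphabet, true);
        // array_search() returns false when nothing matches; map that to -1
        // so that unknown characters sort before every known letter.
        return $index === false ? -1 : $index;
    }
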
    /**
     * Do the comparison between the 2 strings
     *
     * @param string $str1 The first string
     * @param string $str2 The second string
     * @return int -1, 0 or 1 if $str1 < $str2, $str1 == $str2 or $str1 > $str2 respectively
     */
    public function compare($str1, $str2) {
        $chars1 = $this->splitter($str1);
        $chars2 = $this->splitter($str2);
        $len1 = count($chars1);
        $len2 = count($chars2);
        // Compare letter by letter over the shared prefix.
        for ($i = 0; $i < $len1 && $i < $len2; $i++) {
            $p1 = $this->place($chars1[$i]);
            $p2 = $this->place($chars2[$i]);
            if ($p1 < $p2) {
                return -1;
            } elseif ($p1 > $p2) {
                return 1;
            }
        }
        // The shared prefix is identical, so the shorter string sorts first.
        if ($len1 < $len2) {
            return -1;
        } elseif ($len1 > $len2) {
            return 1;
        }
        return 0;
    }

    /**
     * Sort an array of strings based on the alphabet table
     *
     * @param array $ar The array of strings to sort
     * @return array The sorted array.
     */
    public function sort_array($ar) {
        // compare() is an instance method, so the callback must be bound to $this.
        usort($ar, array($this, 'compare'));
        return $ar;
    }
}
To use it with your specific alphabet, call the setAlphabet function to configure your own character sort table:
<?php
$alphabet = array(
'a', 'b', 'c',
'č', 'ć', 'd',
'dž', 'đ', 'e',
'f', 'g', 'h',
'i', 'j', 'k',
'l', 'lj', 'm',
'n', 'nj', 'o',
'p', 'q', 'r',
's', 'š', 't',
'u', 'v', 'w',
'x', 'y', 'z', 'ž'
);
$comperator = new UnicodeStringComperator();
$comperator->setAlphabet($alphabet);
$sorted_words = $comperator->sort_array($words);
var_dump($sorted_words);
The output is your original array:
array(34) {
[0] =>
string(4) "alfa"
[1] =>
string(4) "beta"
[2] =>
string(3) "car"
[3] =>
string(7) "čvarci"
[4] =>
string(4) "ćup"
[5] =>
string(4) "drvo"
[6] =>
string(5) "džem"
[7] =>
string(4) "đak"
[8] =>
string(5) "endem"
[9] =>
string(5) "fićo"
[10] =>
string(4) "grah"
[11] =>
string(5) "hrana"
[12] =>
string(7) "idealan"
[13] =>
string(6) "jabuka"
[14] =>
string(4) "koza"
[15] =>
string(5) "lijep"
[16] =>
string(7) "ljestve"
[17] =>
string(5) "mango"
[18] =>
string(4) "nebo"
[19] =>
string(6) "njezin"
[20] =>
string(5) "obrva"
[21] =>
string(7) "pivnica"
[22] =>
string(6) "qwerty"
[23] =>
string(4) "riba"
[24] =>
string(3) "sir"
[25] =>
string(6) "šaran"
[26] =>
string(5) "tikva"
[27] =>
string(10) "umanjenica"
[28] =>
string(7) "večera"
[29] =>
string(4) "wind"
[30] =>
string(5) "x-ray"
[31] =>
string(6) "yellow"
[32] =>
string(5) "zakaj"
[33] =>
string(5) "žena"
}
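The word list above has at most one word per starting letter, so it does not show how digraphs behave when they compete with their first letter (for example lj against lo or lu). The following standalone sketch is one way to check that case. It assumes PHP 7 or later and a UTF-8 source file, and the function names split_letters, sort_key and compare_keys are made up for this illustration. It applies the idea from the question directly: split each word into letters (longest letter first), map each letter to its index in $alphabet, then compare the index lists.

```php
<?php
// Hypothetical helper names; a sketch of the index-comparison idea, not a library API.

/** Split a word into letters, keeping multi-character letters together. */
function split_letters(string $word, array $alphabet): array {
    // Longest letters first so that 'dž' is tried before 'd'.
    usort($alphabet, function ($a, $b) { return strlen($b) - strlen($a); });
    $alternatives = array_map(function ($letter) {
        return preg_quote($letter, '/');
    }, $alphabet);
    $alternatives[] = '.'; // fallback: any single character
    preg_match_all('/' . implode('|', $alternatives) . '/us', $word, $matches);
    return $matches[0] ?? [];
}

/** Turn a word into a list of alphabet indexes (-1 for unknown characters). */
function sort_key(string $word, array $alphabet): array {
    $positions = array_flip($alphabet);
    return array_map(function ($letter) use ($positions) {
        return $positions[$letter] ?? -1;
    }, split_letters($word, $alphabet));
}

/** Compare two index lists position by position; a shorter prefix sorts first. */
function compare_keys(array $a, array $b): int {
    $shared = min(count($a), count($b));
    for ($i = 0; $i < $shared; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $a[$i] <=> $b[$i];
        }
    }
    return count($a) <=> count($b);
}

$alphabet = [
    'a', 'b', 'c', 'č', 'ć', 'd', 'dž', 'đ', 'e', 'f', 'g', 'h',
    'i', 'j', 'k', 'l', 'lj', 'm', 'n', 'nj', 'o', 'p', 'q', 'r',
    's', 'š', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ž',
];

// Words chosen so that digraphs compete with their first letter.
$words = ['ljubav', 'lopta', 'džem', 'dim', 'njuška', 'noć', 'luk'];
usort($words, function ($x, $y) use ($alphabet) {
    return compare_keys(sort_key($x, $alphabet), sort_key($y, $alphabet));
});
print_r($words); // dim, džem, lopta, luk, ljubav, noć, njuška
```

Here lopta and luk come before ljubav because lj is treated as one letter that follows l, whereas a plain byte-wise sort would put ljubav first. Note that greedy matching always reads lj, nj and dž as single letters; words in which those characters are actually two separate letters cannot be told apart without a dictionary.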