Making a normalized version of a string in SQLite: Polish character ł
Problem description
Apple provides an example, DerivedProperty, of adding an extra column to the database that holds a normalized version of the stored text.
It has a function, normalizeString, which contains this code:
NSMutableString *result = [NSMutableString stringWithString:unprocessedValue];
CFStringNormalize((CFMutableStringRef)result, kCFStringNormalizationFormD);
CFStringFold((CFMutableStringRef)result, kCFCompareCaseInsensitive | kCFCompareDiacriticInsensitive | kCFCompareWidthInsensitive, NULL);
I tested this method; here is an example of text converted to its normalized version:
ąĄćłŁÓŻźŃĘęĆ
-> aacłłozzneec
All characters with diacritics were converted correctly except ł and Ł.
Is there another way to get proper normalization?
Recommended answer
I don't speak Polish, so my answer may be terribly wrong, but according to http://www.unicode.org/Public/6.2.0/ucd/UnicodeData.txt the characters "ł" and "Ł" are not combinations of an "ordinary" character with a diacritical mark.
The entry for "ą" in the Unicode Data file is
0105;LATIN SMALL LETTER A WITH OGONEK;Ll;0;L;0061 0328;;;;N;LATIN SMALL LETTER A OGONEK;;0104;;0104
and the sixth field, "0061 0328", shows that "ą" can be decomposed into "a" (U+0061) and U+0328 (COMBINING OGONEK).
But the entries for "Ł" and "ł" are
0141;LATIN CAPITAL LETTER L WITH STROKE;Lu;0;L;;;;;N;LATIN CAPITAL LETTER L SLASH;;;0142;
0142;LATIN SMALL LETTER L WITH STROKE;Ll;0;L;;;;;N;LATIN SMALL LETTER L SLASH;;0141;;0141
where the sixth field is empty, so these characters do not have a decomposition.
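The difference can be checked from Objective-C as well. The following is only an illustrative sketch, not part of Apple's sample: it assumes Foundation's decomposedStringWithCanonicalMapping (canonical decomposition, NFD), and DecomposedLength is a hypothetical helper name.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: number of UTF-16 code units after canonical decomposition (NFD).
static NSUInteger DecomposedLength(NSString *s) {
    return s.decomposedStringWithCanonicalMapping.length;
}

int main(void) {
    @autoreleasepool {
        // U+0105 (ą) decomposes into U+0061 + U+0328, so it becomes two code units.
        NSLog(@"U+0105: %lu", (unsigned long)DecomposedLength(@"\u0105"));
        // U+0142 (ł) has no decomposition, so it remains a single code unit.
        NSLog(@"U+0142: %lu", (unsigned long)DecomposedLength(@"\u0142"));
    }
    return 0;
}
```

A diacritic-insensitive fold can only strip combining marks that decomposition exposes, which is why "ł" passes through unchanged.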
Therefore I doubt that any function will normalize "ł" into "l"; you would have to do that yourself, using
[result replaceOccurrencesOfString:@"ł" withString:@"l" options:0 range:NSMakeRange(0, [result length])];
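Put together with the code from the question, an adjusted routine might look like the sketch below. The function name NormalizedString and the ARC-style __bridge casts are assumptions for illustration; Apple's sample may be structured differently. Since the case-insensitive fold already turns "Ł" into "ł" (as the output in the question shows), replacing the lowercase character once is enough.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical wrapper around the question's code plus the manual replacement.
static NSString *NormalizedString(NSString *unprocessedValue) {
    if (unprocessedValue == nil) {
        return nil;
    }
    NSMutableString *result = [NSMutableString stringWithString:unprocessedValue];

    // Decompose, then fold case, diacritics and width (same calls as in the question).
    CFStringNormalize((__bridge CFMutableStringRef)result, kCFStringNormalizationFormD);
    CFStringFold((__bridge CFMutableStringRef)result,
                 kCFCompareCaseInsensitive | kCFCompareDiacriticInsensitive | kCFCompareWidthInsensitive,
                 NULL);

    // "ł" has no Unicode decomposition, so map it to "l" explicitly.
    [result replaceOccurrencesOfString:@"ł"
                            withString:@"l"
                               options:0
                                 range:NSMakeRange(0, result.length)];
    return result;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"%@", NormalizedString(@"ąĄćłŁÓŻźŃĘęĆ"));
    }
    return 0;
}
```

Other letters without a decomposition, such as "ø" or "đ", would need their own explicit replacements if they can appear in the data.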