What are the R sorting rules of character vectors?
Question
R sorts character vectors in an order that I would describe as alphabetic rather than ASCII.
For example:
sort(c("dog", "Cat", "Dog", "cat"))
[1] "cat" "Cat" "dog" "Dog"
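Three questions:
- What is the technically correct term for this sort order?
- I cannot find any reference to this in the manuals on CRAN. Where can I find a description of the sorting rules in R?
- Does this behaviour differ from that of other languages, such as C, Java, Perl or PHP?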
Recommended answer
The Details section of help(sort) states:
The sort order for character vectors will depend on the collating
sequence of the locale in use: see ‘Comparison’. The sort order
for factors is the order of their levels (which is particularly
appropriate for ordered factors).
and help(Comparison) then shows:
Comparison of strings in character vectors is lexicographic within
the strings using the collating sequence of the locale in use: see
‘locales’. The collating sequence of locales such as ‘en_US’ is
normally different from ‘C’ (which should use ASCII) and can be
surprising. Beware of making _any_ assumptions about the
collation order: e.g. in Estonian ‘Z’ comes between ‘S’ and ‘T’,
and collation is not necessarily character-by-character - in
Danish ‘aa’ sorts as a single letter, after ‘z’. In Welsh ‘ng’
may or may not be a single sorting unit: if it is it follows ‘g’.
Some platforms may not respect the locale and always sort in
numerical order of the bytes in an 8-bit locale, or in Unicode
point order for a UTF-8 locale (and may not sort in the same order
for the same language in different character sets). Collation of
non-letters (spaces, punctuation signs, hyphens, fractions and so
on) is even more problematic.
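The difference between locale collation and the 'C' ordering mentioned in the help text can be seen without changing any settings. The documentation for sort() describes method = "radix" as always using C-locale ordering for character vectors, so comparing it with the default method shows both behaviours side by side. This is a minimal sketch; the variable names are illustrative only, and the output of the default call depends on the machine's locale.

```r
x <- c("dog", "Cat", "Dog", "cat")

# Default method: uses the collating sequence of the current locale,
# so the result can differ from one machine to another.
locale_sorted <- sort(x)

# Radix method: documented to use C-locale ordering for character
# vectors, so uppercase ASCII letters sort before lowercase ones.
c_sorted <- sort(x, method = "radix")

print(locale_sorted)
print(c_sorted)
```

In an English locale such as en_US the first result is typically the "cat" "Cat" "dog" "Dog" order from the question, while the radix result is "Cat" "Dog" "cat" "dog" regardless of locale.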
So it depends on your locale setting.
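To check which collation locale a session is using, and to see its effect on sort(), something like the following may help. It is a sketch that assumes only base R: Sys.getlocale() and Sys.setlocale() with the "LC_COLLATE" category, and the "C" locale, which should be available on every platform. The names old_collate and c_sorted are made up for the example.

```r
x <- c("dog", "Cat", "Dog", "cat")

# Report the collation locale currently in use for this session.
old_collate <- Sys.getlocale("LC_COLLATE")
print(old_collate)

# Temporarily switch collation to the "C" locale, sort, then restore
# the previous setting so the rest of the session is unaffected.
invisible(Sys.setlocale("LC_COLLATE", "C"))
c_sorted <- sort(x)
invisible(Sys.setlocale("LC_COLLATE", old_collate))

print(c_sorted)
```

Under the "C" locale the comparison follows byte order, so the uppercase strings come first: "Cat" "Dog" "cat" "dog".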