如何使用分配排序(基数排序等),以字符串进行排序? [英] How to use distribution sort (radix sort, etc) to sort strings?

查看:467
本文介绍了如何使用分配排序(基数排序等),以字符串进行排序?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道如何使用基数排序排序整数。

I know how to use radix sort to sort integers.

但如何用它来串排序?或浮点数?

But how to use it to sort strings? or float numbers?

推荐答案

基数排序或任何其他分销排序可能会被用来排序浮点数,如果你忽略他们的一些特殊性像无穷大,不是非数字值和两个不同的重$ P $零psentations。 IEEE 754-2008 浮点数有二重presentations,在排序顺序兼容与整数。因此,如果排除不是-A-号码和reinter preT 浮动 INT32 的Int64 ,您可以直接分配任何种类适用于他们。 编辑:负浮点数需要特殊处理(如指出AShelly),因为它们的排列顺序是相反的整数的排序顺序

Radix sort or any other distribution sort may be used to sort floating point numbers if you ignore some peculiarities of them like infinity, not-a-number values and two different representations of zero. IEEE 754-2008 floating point numbers have binary representations, compatible in sorting order with integer numbers. So, if you exclude not-a-numbers and reinterpret float or double as int32 or int64, you can directly apply any distribution sort to them. Negative floating point numbers need special treatment (as pointed out by AShelly), because their sorting order is opposite to the sorting order of integer numbers.

使用字符串,这是因为它们的长度可变的更加困难。其他种类的分布的排序(桶排序)可被使用,并且通常用于字符串。字符串的几个起始字符用于桶的索引,则任何种类的比较被用于将水桶内的字符串进行排序。

With strings, it is more difficult because of their variable length. Other kind of distribution sort (bucket sort) may be used and is often used for strings. Several starting characters of the string are used for bucket indexing, then any comparative sort is used to sort strings inside the buckets.

如果所有字符串具有几乎相等的长度和/或某些技术被用于扩增字符串之间的差异(像的FAST:快速架构无关树搜索对现代CPU和GPU),然后基数排序可以作为很好:分割字符串的字符组(或更好,比特长度相等的团体),reinter preT这些团体为整数,并继续就好像它是基数排序为整数。

If all strings have almost equal length and/or some technique is used to amplify differences between strings (like described in chapter 6 of "FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs"), then radix sort may be used as well: split the string to groups of characters (or better, to groups of bits) of equal length, reinterpret these groups as integers, and continue as if it is radix sort for integers.

编辑:所有种类分布的种类,保证正确地只为ASCII字符串工作。其他字符串编码可能需要不同的排序顺序或依赖本地化的分页的参数。

All kinds of distribution sort are guaranteed to work properly only for ASCII strings. Other string encodings may require different sort order or may depend on the "collate" parameter of the locale.

这篇关于如何使用分配排序(基数排序等),以字符串进行排序?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆