Natural sort for SQL Server

Question

Similar questions have been asked before, but they always involve data with specific qualities that allow a more targeted "split it up and just sort by this part" approach. That does not work when you don't know the structure of the data in the column, or even which column it is, frankly. In other words, none of them gives a generic "natural" sort order, something roughly equivalent to SELECT * FROM [parts] ORDER BY [part_category] DESC, [part_number] NATURAL DESC

I have a DataView in C# whose Sort property specifies the ORDER BY that ADO would use, and a requirement to sort by a column using a "natural" sort algorithm. In theory I could do just about anything, from creating a different column to sort by (derived from the column I'd like to have sorted naturally) to not sorting in SQL at all and sorting the result set in code afterwards. I'm looking for the best balance of flexibility, efficiency, preparation effort, and maintainability. It would help somewhat to be able to sort such data either after retrieval (in C#) or completely within a stored procedure.

In my mind, and according to customer statements so far, "natural" sort order means treating upper and lower case letters equivalently and considering the magnitude of a number rather than the ASCII value of its digits (that is, x90 comes before x100). Jeff Atwood had a pretty decent discussion of this, but it didn't address SQL sorting. That said, these are my thoughts:

  • Incorporating the magnitude awareness while retaining the ability to sort alpha characters ASCII-betically may also come in handy
  • Non-alphanumeric characters would probably have to be sorted ASCII-betically regardless
  • Decimal point awareness might be more effort than it's worth, since most of the time periods and commas in alphanumeric fields are treated as mere punctuation/separators, and only denote fractional portions when they represent a float field

What is a reasonably flexible, reasonably generic, reasonably efficient approach to implementing a natural sort algorithm for SQL? Weighing the pros and cons, which approach is best? Is there another option?

  • Is there a native SQL way to ORDER BY [field] NATURAL DESC or something?
  • Pure SQL function to create a "sort equivalent": could be used to create some sort of second, possibly indexed, "sort value" column, or called from a stored procedure, or specified in an ORDER BY clause. But how to write it efficiently? (Loops? Is there a set-based solution at all?)
  • CLR SQL function: the usability benefits of a pure SQL function, but written in a procedural language like C#. The algorithm should be no problem, but can it be made faster than a pure SQL (set-based?) implementation? It could also be referenced and used from C# if efficient enough.
  • Avoid SQL Server: parsing an arbitrary number of numbers amid all sorts of other characters is really best suited to looping or recursion, and T-SQL is not well suited to either (though technically supported, all I see is "DON'T USE LOOPS!!!" and "CTEs are even worse!!!")
  • Some sort of comparator in SQL(??): SQL doesn't seem to lend itself to that sort of sorting, and I don't see a way to specify a comparator to use, so I guess this won't work...

I have values at least as varied as the following:

100s455t
200s400
d399487
S0000005.2
d400400
d99222
cg9876
D550-9-1
CL2009-3-27
f2g099
f2g100
f2g1000
f2g999
cg 8837
99s1000f

These should be sorted as follows:

99s1000f
100s455t
200s400
cg9876
cg 8837
CL2009-3-27
D550-9-1
d99222
d399487
d400400
f2g099
f2g100
f2g999
f2g1000
S0000005.2
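
As a point of reference for the ordering above, here is a minimal sketch of the rules described in the question (case-insensitive text, digit runs compared by magnitude). It is written in Python purely for illustration and is not SQL Server or C# code; the name `natural_key` is hypothetical. Under these assumptions it reproduces the expected order for the sample values:

```python
import re


def natural_key(value: str) -> list:
    """Build a comparison key: text runs case-folded, digit runs as integers."""
    # Splitting on a capturing group always yields text, digits, text, ...
    # so even positions hold text and odd positions hold digit runs. Keys
    # therefore never compare a str against an int at the same position.
    parts = re.split(r"([0-9]+)", value)
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]


values = [
    "100s455t", "200s400", "d399487", "S0000005.2", "d400400",
    "d99222", "cg9876", "D550-9-1", "CL2009-3-27", "f2g099",
    "f2g100", "f2g1000", "f2g999", "cg 8837", "99s1000f",
]

for v in sorted(values, key=natural_key):
    print(v)
```

Note that this key treats values such as f2g099 and f2g99 as equal, and it makes no attempt at decimal point awareness, in line with the third thought listed earlier.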

Recommended answer

Create a sort column. That way you can keep all the usual mechanisms you use today for sorting in place. For example, you can index that column.

Split the string into parts. You need to pad the number parts with zeroes to the maximum possible number length.

For example, CL2009-3 would become CL|000002009|-|000000003.

This way the usual case-insensitive SQL Server collation sort behavior will produce the right order.
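
The padding step can be sketched as follows. This is an illustration in Python, not the answerer's code: the function name `padded_sort_key`, the width of 9 digits, and the `|` separator are assumptions taken from the shape of the CL2009-3 example. A digit run longer than the chosen width would not be truncated and would break the ordering, so the width has to cover the longest number you expect.

```python
import re


def padded_sort_key(value: str, width: int = 9, sep: str = "|") -> str:
    """Zero-pad every digit run to `width` and join the runs with `sep`."""
    # Each match is a (digits, text) pair in which exactly one side is non-empty.
    runs = re.findall(r"([0-9]+)|([^0-9]+)", value)
    return sep.join(digits.zfill(width) if digits else text for digits, text in runs)


print(padded_sort_key("CL2009-3"))  # CL|000002009|-|000000003
```

With such keys, an ordinary string comparison places f2g999 before f2g1000, which a comparison of the raw values does not. How the separator and spaces sort relative to digits depends on the collation in use, so edge cases such as "cg 8837" versus "cg9876" should be checked against the actual column collation.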

Doing a natural sort dynamically prevents indexing, requires the entire data set to be moved into the app for each query, and is resource intensive.

Instead, simply update the sort column whenever you update the base column.
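
A rough sketch of that workflow, using Python's built-in `sqlite3` module as a stand-in for SQL Server so the example is runnable on its own. The `parts` table and `part_number` column come from the example query in the question; the `part_number_sort` column, the index name, and the helper functions are hypothetical. In SQL Server the same idea might live in the application's write path or in a trigger.

```python
import re
import sqlite3


def padded_sort_key(value, width=9):
    """Zero-pad digit runs so plain string ordering follows numeric magnitude."""
    runs = re.findall(r"([0-9]+)|([^0-9]+)", value)
    return "|".join(d.zfill(width) if d else t for d, t in runs)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts ("
    " id INTEGER PRIMARY KEY,"
    " part_number TEXT NOT NULL,"
    " part_number_sort TEXT NOT NULL COLLATE NOCASE)"  # case-insensitive ordering
)
# The sort column is ordinary data, so it can be indexed like any other column.
conn.execute("CREATE INDEX ix_parts_sort ON parts (part_number_sort)")


def save_part(conn, part_id, part_number):
    """Insert or overwrite a row, always writing both columns together."""
    conn.execute(
        "INSERT OR REPLACE INTO parts (id, part_number, part_number_sort) "
        "VALUES (?, ?, ?)",
        (part_id, part_number, padded_sort_key(part_number)),
    )


def ordered_parts(conn):
    """Return part numbers in natural order using a plain ORDER BY."""
    rows = conn.execute("SELECT part_number FROM parts ORDER BY part_number_sort")
    return [row[0] for row in rows]


samples = ["f2g1000", "f2g999", "d99222", "D550-9-1", "f2g099"]
for part_id, number in enumerate(samples, start=1):
    save_part(conn, part_id, number)

print(ordered_parts(conn))
save_part(conn, 2, "f2g100")  # changing the base value refreshes its sort key
print(ordered_parts(conn))
```

The design point is that the expensive parsing happens once per write, while every read is an ordinary indexed ORDER BY.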
