How can I build a comparator that sorts Strings the same way that PostgreSQL does?

Question

I'm writing an integration test that passes a complex ORDER BY to PostgreSQL and then checks that the data comes back in the correct order. I'm writing this integration test in Java, and its String.compareTo method appears to sort things differently from PostgreSQL. I ran this on my PostgreSQL database:

SELECT regexp_split_to_table('D d a A c b', ' ') ORDER BY 1;

It returned the following:

a
A
b
c
d
D

I then created this unit test to compare that with the way Java sorts things:

import com.google.common.collect.Lists;
import com.google.common.collect.Ordering;
import org.junit.Test;

import java.util.List;

import static org.junit.Assert.assertEquals;

public class PostgresqlSortOrderTest {

    @Test
    public void whenJavaSortsStringsThenItIsTheSameAsWhenPostgresqlSortsStrings() {
        List<String> postgresqlOrder = Lists.newArrayList("a", "A", "b", "c", "d", "D");
        Ordering<String> ordering = new Ordering<String>() {
            @Override
            public int compare(String left, String right) {

                return left.compareTo(right);
            }
        };
        List<String> javaOrdering = ordering.sortedCopy(postgresqlOrder);
        assertEquals(postgresqlOrder, javaOrdering);
    }

}

The test failed with this output:

Expected :[a, A, b, c, d, D]  //postgresql
Actual   :[A, D, a, b, c, d]  //java

I'm very ignorant of the terminology here. I'd like to know the names of these different string sort orders so I can communicate better. But more importantly, how can I make Java sort the way PostgreSQL does?

Recommended answer

Late to show up with an answer, but I'm afraid a simple case-insensitive comparison isn't necessarily going to do what you want.

The keyword you want in your searches is collation (and, in a wider sense, locales). PostgreSQL relies on the underlying operating system to provide support for this. The ordering is rarely a simple character-by-character comparison. For example, in many locales spaces are ignored (that's certainly the case in en_GB).
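On the Java side, the closest standard counterpart to a locale collation is java.text.Collator. The following is only a minimal sketch, assuming the database uses an English locale such as en_US; the class name CollatorSortSketch and the helper sortWithCollator are made up for illustration. Java's collation rules are not guaranteed to match those of the operating system that PostgreSQL relies on (handling of spaces and punctuation in particular may differ), so treat it as a starting point rather than an exact replica.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CollatorSortSketch {

    // Hypothetical helper: returns a copy of the input sorted with the
    // locale-aware rules that java.text.Collator provides for the locale.
    static List<String> sortWithCollator(List<String> input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(input);
        copy.sort(collator); // Collator implements Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        // Same values as the SELECT in the question.
        List<String> values = Arrays.asList("D", "d", "a", "A", "c", "b");
        // Assumption: the database locale is en_US; adjust to match yours.
        System.out.println(sortWithCollator(values, Locale.US));
    }
}
```

Unlike String.compareTo, a Collator keeps the upper-case and lower-case forms of a letter next to each other instead of placing all capitals first. In the test from the question, the Ordering's compare method could delegate to such a Collator instead of left.compareTo(right).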

This also means you can end up with different sort orders on different platforms (depending on whether Apple or Microsoft agree with Linus as to the default ordering for your country).

There has been some discussion as to whether it would make sense to include a BSD-licenced library to provide a consistent set of orderings across platforms. However, that would be a lot of work, and it would mean the database could sort differently from the rest of your operating system. As long as different providers disagree on how to handle this, I'm afraid there is no one simple solution.

You might want to investigate the "C" collation for "traditional" sorting. I'm afraid I can't comment on Java's handling of proper locale sorting, as that's not my field.
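If the query is sorted with the "C" collation (for example by adding COLLATE "C" to the sort expression), the strings are compared by their raw bytes, which is easy to reproduce in Java. The sketch below assumes a UTF8-encoded database; the class name ByteOrderSortSketch and the comparator UTF8_BYTE_ORDER are hypothetical names chosen for this example. For plain ASCII data String.compareTo already gives the same order, and the byte-wise comparator only differs from it for some characters outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ByteOrderSortSketch {

    // Hypothetical comparator: compares the UTF-8 encodings of two strings
    // byte by byte, treating each byte as an unsigned value (0..255).
    static final Comparator<String> UTF8_BYTE_ORDER = (left, right) -> {
        byte[] a = left.getBytes(StandardCharsets.UTF_8);
        byte[] b = right.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // If one encoding is a prefix of the other, the shorter sorts first.
        return a.length - b.length;
    };

    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sortByBytes(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(UTF8_BYTE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("D", "d", "a", "A", "c", "b");
        System.out.println(sortByBytes(values)); // prints [A, D, a, b, c, d]
    }
}
```

This is the same order the failing test reported for Java, so pinning the query to the "C" collation is one way to make both sides agree without depending on the operating system's locale data.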
