Ruby的max函数顺序如何重复? [英] How does Ruby's max function order duplicates?

查看:81
本文介绍了Ruby的max函数顺序如何重复?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在研究 max方法在Ruby的Enumerable mixin(v2.4.1)中.

I've been looking at the max method in Ruby's Enumerable mixin (v2.4.1).

这是一种非常简单的方法,但是当重复项出现时如何排序项目却有点令人困惑.

It's a fairly straightforward method, but how it orders items when duplicates are present is a bit confusing.

例如:

x = [1,2,3,4,5,6,7,8,9]
x.max {|a, b| a%2 <=> b%2}
=> 1
10.times{|y| p x.max(y) {|a, b| a%2 <=> b%2}}
[]
[1]
[1, 7] # why is 7 the next element after 1?
[3, 1, 5] # why no more 7?
[7, 3, 1, 5] # 7 is now first
[9, 7, 3, 1, 5]
[9, 7, 3, 1, 5, 6]
[9, 7, 3, 1, 5, 4, 6]
[9, 7, 3, 1, 5, 2, 4, 6]
[9, 7, 5, 3, 1, 8, 6, 4, 2] # order has changed again (now seems more "natural")

如何选择7作为第二项?为什么当取三个值时根本不选择它?

How is 7 chosen as the second item? Why is it not chosen at all when three values are taken?

如果您使用更多的数字,则排序是不一致的(尽管集合中的项目是).

If you take even more numbers, the ordering is not consistent (though the items in the set are).

我浏览了源代码,但是似乎在进行正常的比较;从该代码中看不出此处看到的顺序.

I have taken a glance at the source code, but it seems to be doing normal comparison; the ordering seen here isn't evident from that code.

任何人都可以解释如何实现此顺序吗?我知道上面的命令都是有效的",但是它们是如何生成的?

Can anyone explain how this ordering is achieved? I know that the orderings above are all "valid", but how are they generated?

推荐答案

可以使用max_by来简化示例,以产生相似的结果:

Your example can be simplified by using max_by to yield the similar result:

10.times{|y| p x.max_by(y) {|t| t%2}}

我花了一些时间来寻找消息来源,但找不到任何漏洞.

I have spent some time with the source but can't find any hole.

我记得看到一个名为

After I remember seeing a publication called Switch: A Deep Embedding of Queries into Ruby (dissertation by Manuel Mayr) I found the answer.

在第104页的哪里,您可以找到max_by的答案:

Where on page 104 you can find the answer for max_by:

...在此,输入列表中的值在以下情况下假定为最大值 函数返回的值返回.如果产生一个以上的值 在最大值中,这些值之间的结果选择是任意的. ...

... Here, the value in the input list that assumes the maximum when evaluated by the function is returned. If more than one value yields the maximum, the choice for a result among these values is arbitrary. ...


同样适用于:
sort&从评论中sort_by @
emu.c


Similarly for:
sort & sort_by from the comments @ emu.c

不能保证结果稳定.当两个键相等时 相应元素的顺序是不可预测的.

The result is not guaranteed to be stable. When two keys are equal, the order of the corresponding elements is unpredictable.

第一,第二编辑-我们需要更深入" =>我希望您会喜欢骑行".

First, Second Edit - "we need to go deeper" => I hope you will enjoy the "ride".


简短答案:
排序看起来像是这样做的原因是max_by块的组合(导致从%2中的max值开始排序,而1然后是0,然后继续0)和qsort_r(BSD快速排序)实现@ruby.


The short answer:
The reason for the sorting looking like it does is combination of max_by block(causes to start sorting with max values from %2 which is 1 then it continues with 0) and qsort_r (BSD quick-sort) implemented @ruby.



长答案: 全部基于ruby 2.4.2或当前2.5.0(目前正在开发)的源代码.



The long answer: All is based on the source code of ruby 2.4.2 or currently 2.5.0 (which is being developed right now).

根据您使用的编译器,快速排序算法可能会有所不同.您可以使用qsort_r:GNU版本,BSD版本(您可以检查配置. ac ). Visual Studio正在使用2012年或更高版本的BSD版本.

The quick-sort algorithm can differ based on the compiler you are using. You can be using qsort_r: GNU version, BSD version (you can check the configure.ac) for more. The visual studio is using BSD version from 2012 or later.

+Tue Sep 15 12:44:32 2015  Nobuyoshi Nakada  <nobu@ruby-lang.org>
+
+   * util.c (ruby_qsort): use BSD-style qsort_r if available.


Thu May 12 00:18:19 2016  NAKAMURA Usaku  <usa@ruby-lang.org>

    * win32/Makefile.sub (HAVE_QSORT_S): use qsort_s only for Visual Studio
      2012 or later, because VS2010 seems to causes a SEGV in
test/ruby/test_enum.rb.

  1. 如果您具有GNU qsort_r但没有BSD: 仅使用内部ruby_qsort实现.检查 util.c 以了解内部快速排序(ruby_qsort(void* base, const size_t nel, const size_t size, cmpfunc_t *cmp, void *d))由村村智之(Tomoyuki Kawamura)提供的功能. @util.h

  1. If you have GNU qsort_r but not BSD: Only internal ruby_qsort implementation is used. Check util.c for internal implementation of the quick-sort (ruby_qsort(void* base, const size_t nel, const size_t size, cmpfunc_t *cmp, void *d)) function by Tomoyuki Kawamura. @util.h

如果HAVE_GNU_QSORT_R = 1,则#define ruby_qsort qsort_r:

If HAVE_GNU_QSORT_R=1 then #define ruby_qsort qsort_r:

#ifdef HAVE_GNU_QSORT_R
#define ruby_qsort qsort_r
#else    void ruby_qsort(void *, const size_t, const size_t,
    int (*)(const void *, const void *, void *), void *);
#endif

  • 如果检测到BSD样式: 然后使用下面的代码(可以在 util.c 中找到).注意在ruby_qsort之前如何调用cmp_bsd_qsort.原因?可能是标准化,堆栈空间甚至是速度(我自己没有测试它-必须创建基准测试,这很耗时).

  • If the BSD-style is detected: Then the below code is used (can be found at util.c). Notice how the cmp_bsd_qsort is called before ruby_qsort. The reason? Probably standardization, stack space and maybe speed (did not test it myself - would have to create benchmark, and that is quite time consuming).

    节省的堆栈空间在BSD qsort.c源代码中指示:

    The saving stack space is indicated in the BSD qsort.c source code:

        /*
        * To save stack space we sort the smaller side of the partition first
        * using recursion and eliminate tail recursion for the larger side.
        */
    

    ruby​​源代码处的BSD分支:

    The BSD branch at the ruby source code:

         #if defined HAVE_BSD_QSORT_R
        typedef int (cmpfunc_t)(const void*, const void*, void*);
    
        struct bsd_qsort_r_args {
            cmpfunc_t *cmp;
            void *arg;
        };
    
        static int
        cmp_bsd_qsort(void *d, const void *a, const void *b)
        {
            const struct bsd_qsort_r_args *args = d;
            return (*args->cmp)(a, b, args->arg);
        }
    
        void
        ruby_qsort(void* base, const size_t nel, const size_t size, cmpfunc_t *cmp, void *d)
        {
            struct bsd_qsort_r_args args;
            args.cmp = cmp;
            args.arg = d;
            qsort_r(base, nel, size, &args, cmp_bsd_qsort);
        }
    

    如果您使用MSYS2在Windows上编译红宝石(没有更多的DevKit,而是用于Windows安装程序的MSYS2,我大部分时间都在使用)NetBSD版本的qsort_r(从02-07-2012开始).最新的NetBSD

    If you are using MSYS2 to compile your ruby on Windows (no more DevKit but MSYS2 for windows installer, which I'm using most of the time ) NetBSD version of qsort_r (which is there from 02-07-2012). The latest NetBSD qsort.c (revision:1.23).

    现在是真实示例-我们需要更深入地研究"

    Now for the real-life examples - "we need to go even deeper"

    测试将在两个(Windows)红宝石上进行:

    The tests will be performed on two (windows) rubies:

    • 第一个ruby:将基于DevKit版本2.2.2p95(已于2015年4月13日发布),并且不包含BSD qsort实现.

    • First ruby: will be based on DevKit version 2.2.2p95 (which was released on 13 Apr 2015) and does not contain the BSD qsort implementation.

    第二个ruby:将基于MSYS2 tool-chain版本的ruby 2.4.2-p198(已于2017年9月15日发布),并且包含用于BSD qsort的补丁(请参见上文).

    Second ruby: will be based on MSYS2 tool-chain version ruby 2.4.2-p198 (which was released on 15th Sep 2017) and does contain patch (see above) for BSD qsort implementation.

    代码:

    x=[1,2,3,4,5,6,7,8,9]
    10.times{|y| p x.max_by(y) {|t| t%2}}
    

    Ruby 2.2.2p95:

    Ruby 2.2.2p95:

    The result:
    []
    [5]
    [7, 1]
    [3, 1, 5]
    [7, 3, 1, 5]
    [9, 7, 3, 1, 5]
    [5, 9, 1, 3, 7, 6]
    [5, 1, 9, 3, 7, 6, 4]
    [5, 1, 3, 7, 9, 6, 4, 2]
    [9, 1, 7, 3, 5, 4, 6, 8, 2]
    

    Ruby 2.4.2-p198:

    Ruby 2.4.2-p198:

    The result:
    []
    [1]
    [7, 1]
    [5, 3, 1]
    [5, 7, 3, 1]
    [5, 9, 7, 3, 1]
    [5, 1, 9, 7, 3, 6]
    [5, 1, 3, 9, 7, 4, 6]
    [5, 1, 3, 7, 9, 2, 6, 4]
    [9, 1, 3, 7, 5, 8, 4, 6, 2]
    

    现在用于不同的x: x=[7,9,3,4,2,6,1,8,5]

    Ruby 2.2.2p95:

    Ruby 2.2.2p95:

    The result:
    []
    [1]
    [9, 7]
    [1, 7, 3]
    [5, 1, 7, 3]
    [5, 1, 3, 9, 7]
    [7, 5, 9, 3, 1, 2]
    [7, 9, 5, 3, 1, 2, 4]
    [7, 9, 3, 1, 5, 2, 4, 8]
    [5, 9, 1, 3, 7, 4, 6, 8, 2]
    

    Ruby 2.4.2-p198:

    Ruby 2.4.2-p198:

    The result:
    []
    [9]
    [9, 7]
    [3, 1, 7]
    [3, 5, 1, 7]
    [7, 5, 1, 3, 9]
    [7, 9, 5, 1, 3, 2]
    [7, 9, 3, 5, 1, 4, 2]
    [7, 9, 3, 1, 5, 8, 2, 4]
    [5, 9, 3, 1, 7, 2, 4, 6, 8]
    

    现在对于源数组中的相同项(qsort不稳定,请参见下文): x=[1, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

    Now for the same items in the source array (qsort is unstable see below): x=[1, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

    使用以下代码对其进行处理: 12.times{|y| p x.max_by(y) {|t| t%2}}

    Process it with the following code: 12.times{|y| p x.max_by(y) {|t| t%2}}

    Ruby 2.2.2p95:

    Ruby 2.2.2p95:

    The result:
    []
    [3]
    [1, 1]
    [9, 1, 7]
    [3, 9, 1, 7]
    [5, 3, 9, 1, 7]
    [1, 5, 3, 9, 1, 7]
    [5, 9, 3, 7, 1, 1, 1]
    [1, 5, 9, 1, 7, 1, 3, 4]
    [1, 1, 5, 9, 1, 7, 3, 4, 2]
    [1, 1, 1, 5, 7, 3, 9, 4, 2, 8]
    [9, 1, 7, 1, 5, 3, 1, 2, 6, 8, 4]
    

    Ruby 2.4.2-p198:

    Ruby 2.4.2-p198:

    The Result:
    []
    [1]
    [1, 1]
    [7, 9, 1]
    [7, 3, 9, 1]
    [7, 5, 3, 9, 1]
    [7, 1, 5, 3, 9, 1]
    [1, 5, 9, 3, 7, 1, 1]
    [1, 1, 5, 9, 3, 7, 1, 4]
    [1, 1, 1, 5, 9, 3, 7, 2, 4]
    [1, 7, 3, 1, 5, 9, 1, 2, 4, 8]
    [9, 3, 1, 7, 1, 5, 1, 2, 8, 6, 4]
    

    现在要提一个大问题->现在为什么结果不同?

    Now for the big question --> Now why are the results different?

    第一个明显的答案是,当使用GNU或BSD实现时,结果会有所不同吗?正确的?很好的实现是不同的,但是产生的结果相同(请检查链接的实现以获取详细信息).问题的核心在其他地方.

    The first obvious answer would be that when using GNU or BSD implementations the result would be different? Right? Well the implementations are different, but yielding (check the linked implementations for details) same results. The core of the issue is elsewhere.

    真正的问题在于算法本身.当使用快速排序时,您得到的是不稳定排序(当您比较两个相等的值时,它们的顺序不会保持相同).当您拥有了[1,2,3,4,5,6,7,8,9]之后,您就可以将其转换为[1,0,1,0,1,0,1,0,1]使用max(_by),您正在将数组排序为[1,1,1,1,1,0,0,0,0,0].您从1开始,但是哪个?好吧,那么您将得到不可预测的结果. (max(_by)是为什么先获得奇数,然后获得偶数的原因.)

    The real issue here the algorithm itself. When quick-sort is used then what you get is unstable sorting (when you are comparing the two equal values their order does not remain the same). When you have your [1,2,3,4,5,6,7,8,9] which you then transform in the block to [1,0,1,0,1,0,1,0,1] with max(_by) you are sorting the array into [1,1,1,1,1,0,0,0,0]. You start with 1, but which one? Well then you get unpredictable result. (max(_by) is the reason why are getting the odd numbers first and then the even ones).

    请参见 GNU qsort 注释:

    警告:如果两个对象比较相等,则排序后的顺序为 不可预测的.也就是说,排序不稳定.这个可以 当比较仅考虑部分 元素.具有相同排序键的两个元素可能在其他元素上有所不同 尊重.

    Warning: If two objects compare as equal, their order after sorting is unpredictable. That is to say, the sorting is not stable. This can make a difference when the comparison considers only part of the elements. Two elements with the same sort key may differ in other respects.

    现在像引擎一样对其进行排序:

    Now to sort it like the engine does:

    [1,2,3,4,5,6,7,8,9]->首先考虑的是奇数[1,3,5,7,9],认为与max_by{|t| t%2}相等的奇数产生[1,1,1,1,1].

    [1,2,3,4,5,6,7,8,9] -> The first that are taken in account are the odd numbers [1,3,5,7,9] and those are considered equal with max_by{|t| t%2} produces [1,1,1,1,1].

    结论:

    现在要服用哪一个?好吧,在您的情况下,这是您无法预测的.即使使用与基础 quick-sort 算法相同的红宝石版本,我也会得到不同的结果本质上是不稳定的.

    Now which one to take? Well it is unpredictable in your case it was the ones you got. I'll get different ones even for the same ruby version as the underlying quick-sort algorithm is unstable by its nature.

    这篇关于Ruby的max函数顺序如何重复?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

  • 查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆