十进制不同的再presentations为Double数学解释 [英] Mathematical explanation of Decimal different representations as Double

查看:250
本文介绍了十进制不同的再presentations为Double数学解释的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我不知道如果陈述一个堆栈溢出问题,这种不规范的方式是好是坏,但这里有云:

I am not sure if this non-standard way of stating a Stack Overflow question is good or bad, but here goes:

什么是最好的(数学或其他技术)的解释,为什么在code:

What is the best (mathematical or otherwise technical) explanation why the code:

static void Main()
{
  decimal[] arr =
  {
    42m,
    42.0m,
    42.00m,
    42.000m,
    42.0000m,
    42.00000m,
    42.000000m,
    42.0000000m,
    42.00000000m,
    42.000000000m,
    42.0000000000m,
    42.00000000000m,
    42.000000000000m,
    42.0000000000000m,
    42.00000000000000m,
    42.000000000000000m,
    42.0000000000000000m,
    42.00000000000000000m,
    42.000000000000000000m,
    42.0000000000000000000m,
    42.00000000000000000000m,
    42.000000000000000000000m,
    42.0000000000000000000000m,
    42.00000000000000000000000m,
    42.000000000000000000000000m,
    42.0000000000000000000000000m,
    42.00000000000000000000000000m,
    42.000000000000000000000000000m,
  };

  foreach (var m in arr)
  {
    Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
      "{0,-32}{1,-20:R}{2:X8}", m, (double)m, m.GetHashCode()
      ));
  }

  Console.WriteLine("Funny consequences:");
  var h1 = new HashSet<decimal>(arr);
  Console.WriteLine(h1.Count);
  var h2 = new HashSet<double>(arr.Select(m => (double)m));
  Console.WriteLine(h2.Count);
}

给出了下面的搞笑(显然是不正确的)输出:

gives the following "funny" (apparently incorrect) output:

42                              42                  40450000
42.0                            42                  40450000
42.00                           42                  40450000
42.000                          42                  40450000
42.0000                         42                  40450000
42.00000                        42                  40450000
42.000000                       42                  40450000
42.0000000                      42                  40450000
42.00000000                     42                  40450000
42.000000000                    42                  40450000
42.0000000000                   42                  40450000
42.00000000000                  42                  40450000
42.000000000000                 42                  40450000
42.0000000000000                42                  40450000
42.00000000000000               42                  40450000
42.000000000000000              42                  40450000
42.0000000000000000             42                  40450000
42.00000000000000000            42                  40450000
42.000000000000000000           42                  40450000
42.0000000000000000000          42                  40450000
42.00000000000000000000         42                  40450000
42.000000000000000000000        41.999999999999993  BFBB000F
42.0000000000000000000000       42                  40450000
42.00000000000000000000000      42.000000000000007  40450000
42.000000000000000000000000     42                  40450000
42.0000000000000000000000000    42                  40450000
42.00000000000000000000000000   42                  40450000
42.000000000000000000000000000  42                  40450000
Funny consequences:
2
3

在.NET 4.5.2试过这一点。

Tried this under .NET 4.5.2.

推荐答案

Decimal.cs ,我们可以看到, GetHash code()实现为本土code。此外,我们可以看到,中投以实现为调用 ToDouble(),这又是实现本地code。因此,从那里,我们看不到的行为合乎逻辑的解释。

In Decimal.cs, we can see that GetHashCode() is implemented as native code. Furthermore, we can see that the cast to double is implemented as a call to ToDouble(), which in turn is implemented as native code. So from there, we can't see a logical explanation for the behaviour.

在旧共享源代码CLI ,我们可以发现这些方法有希望带来了曙光,如果他们没有改变太多旧的实现。我们可以在comdecimal.cpp发现:

In the old Shared Source CLI, we can find old implementations of these methods that hopefully sheds some light, if they haven't changed too much. We can find in comdecimal.cpp:

FCIMPL1(INT32, COMDecimal::GetHashCode, DECIMAL *d)
{
    WRAPPER_CONTRACT;
    STATIC_CONTRACT_SO_TOLERANT;

    ENSURE_OLEAUT32_LOADED();

    _ASSERTE(d != NULL);
    double dbl;
    VarR8FromDec(d, &dbl);
    if (dbl == 0.0) {
        // Ensure 0 and -0 have the same hash code
        return 0;
    }
    return ((int *)&dbl)[0] ^ ((int *)&dbl)[1];
}
FCIMPLEND

FCIMPL1(double, COMDecimal::ToDouble, DECIMAL d)
{
    WRAPPER_CONTRACT;
    STATIC_CONTRACT_SO_TOLERANT;

    ENSURE_OLEAUT32_LOADED();

    double result;
    VarR8FromDec(&d, &result);
    return result;
}
FCIMPLEND

我们可以看到,在 GetHash code()的实施是基于转化为 :哈希code是基于转化为后产生的双字节。它是基于等于十进制值转换为等于值的假设。

We can see that the the GetHashCode() implementation is based on the conversion to double: the hash code is based on the bytes that result after a conversion to double. It is based on the assumption that equal decimal values convert to equal double values.

因此​​,让我们的测试 VarR8FromDec 系统调用.NET之外的:

So let's test the VarR8FromDec system call outside of .NET:

在德尔福(我实际使用FreePascal的),这里是一个简短的程序来调用系统功能直接测试自己的行为:

In Delphi (I'm actually using FreePascal), here's a short program to call the system functions directly to test their behaviour:

{$MODE Delphi}
program Test;
uses
  Windows,
  SysUtils,
  Variants;
type
  Decimal = TVarData;
function VarDecFromStr(const strIn: WideString; lcid: LCID; dwFlags: ULONG): Decimal; safecall; external 'oleaut32.dll';
function VarDecAdd(const decLeft, decRight: Decimal): Decimal; safecall; external 'oleaut32.dll';
function VarDecSub(const decLeft, decRight: Decimal): Decimal; safecall; external 'oleaut32.dll';
function VarDecDiv(const decLeft, decRight: Decimal): Decimal; safecall; external 'oleaut32.dll';
function VarBstrFromDec(const decIn: Decimal; lcid: LCID; dwFlags: ULONG): WideString; safecall; external 'oleaut32.dll';
function VarR8FromDec(const decIn: Decimal): Double; safecall; external 'oleaut32.dll';
var
  Zero, One, Ten, FortyTwo, Fraction: Decimal;
  I: Integer;
begin
  try
    Zero := VarDecFromStr('0', 0, 0);
    One := VarDecFromStr('1', 0, 0);
    Ten := VarDecFromStr('10', 0, 0);
    FortyTwo := VarDecFromStr('42', 0, 0);
    Fraction := One;
    for I := 1 to 40 do
    begin
      FortyTwo := VarDecSub(VarDecAdd(FortyTwo, Fraction), Fraction);
      Fraction := VarDecDiv(Fraction, Ten);
      Write(I: 2, ': ');
      if VarR8FromDec(FortyTwo) = 42 then WriteLn('ok') else WriteLn('not ok');
    end;
  except on E: Exception do
    WriteLn(E.Message);
  end;
end.

请注意,由于Delphi和FreePascal的对任何浮点十进制类型没有语言的支持,我打电话系统函数来执行计算。我设置 FortyTwo 第一个 42 。我再加入 1 和减去 1 。我再加入 0.1 和减去 0.1 。等等。这将导致小数precision将扩大在.NET中以同样的方式。

Note that since Delphi and FreePascal have no language support for any floating-point decimal type, I'm calling system functions to perform the calculations. I'm setting FortyTwo first to 42. I then add 1 and subtract 1. I then add 0.1 and subtract 0.1. Et cetera. This causes the precision of the decimal to be extended the same way in .NET.

和这里的(部分)的输出:

And here's (part of) the output:


...
20: ok
21: ok
22: not ok
23: ok
24: not ok
25: ok
26: ok
...

因此​​可见这确实是Windows中一个长期存在的问题,仅仅发生在由.NET曝光。这是系统的功能,是给不同的结果等于十进制值,或者他们应该是固定的,或.NET应改为不使用有缺陷的功能。

Thus showing that this is indeed a long-standing problem in Windows that merely happens to be exposed by .NET. It's system functions that are giving different results for equal decimal values, and either they should be fixed, or .NET should be changed to not use defective functions.

现在,在新的.NET的核心,我们可以在它的 decimal.cpp code,以解决此问题:

Now, in the new .NET Core, we can see in its decimal.cpp code to work around the problem:

FCIMPL1(INT32, COMDecimal::GetHashCode, DECIMAL *d)
{
    FCALL_CONTRACT;

    ENSURE_OLEAUT32_LOADED();

    _ASSERTE(d != NULL);
    double dbl;
    VarR8FromDec(d, &dbl);
    if (dbl == 0.0) {
        // Ensure 0 and -0 have the same hash code
        return 0;
    }
    // conversion to double is lossy and produces rounding errors so we mask off the lowest 4 bits
    // 
    // For example these two numerically equal decimals with different internal representations produce
    // slightly different results when converted to double:
    //
    // decimal a = new decimal(new int[] { 0x76969696, 0x2fdd49fa, 0x409783ff, 0x00160000 });
    //                     => (decimal)1999021.176470588235294117647000000000 => (double)1999021.176470588
    // decimal b = new decimal(new int[] { 0x3f0f0f0f, 0x1e62edcc, 0x06758d33, 0x00150000 }); 
    //                     => (decimal)1999021.176470588235294117647000000000 => (double)1999021.1764705882
    //
    return ((((int *)&dbl)[0]) & 0xFFFFFFF0) ^ ((int *)&dbl)[1];
}
FCIMPLEND

这似乎是在目前的.NET框架实现过,基于这样的事实是错误的一间双人值也给出相同的散列code,但是这还不足以彻底解决这一问题。

This appears to be implemented in the current .NET Framework too, based on the fact that one of the wrong double values does give the same hash code, but it's not enough to completely fix the problem.

这篇关于十进制不同的再presentations为Double数学解释的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆