Writing IEEE 754-1985 double as ASCII on a limited 16 bytes string


Problem description

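This is a follow-up to my original post (http://stackoverflow.com/questions/32629084/best-ieee-754-1985-representation-for-x3-9-1978-based-standard), but I'll repeat the context here for clarity:

As per the DICOM standard, a floating-point value can be stored using a Value Representation of Decimal String. See Table 6.2-1, DICOM Value Representations: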

Decimal String: A string of characters representing either a fixed point number or a floating point number. A fixed point number shall contain only the characters 0-9 with an optional leading "+" or "-" and an optional "." to mark the decimal point. A floating point number shall be conveyed as defined in ANSI X3.9, with an "E" or "e" to indicate the start of the exponent. Decimal Strings may be padded with leading or trailing spaces. Embedded spaces are not allowed.

"0"-"9", "+", "-", "E", "e", "." and the SPACE character of Default Character Repertoire. 16 bytes maximum

The standard says that the textual representation may be either fixed point or floating point. It only specifies how the values are represented within the DICOM data set itself. As such, there is no requirement to load a fixed-point textual representation into a fixed-point variable.
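The two constraints quoted above (the allowed characters and the 16-byte maximum) can be checked mechanically. The following is only a minimal sketch: `ds_charset_ok` is a hypothetical helper name, not something defined by DICOM or by any library, and it checks the length and character set only, not that the string is a well-formed number or that spaces are not embedded.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): returns true when
 * `s` is at most 16 bytes long and uses only the characters listed in the
 * quoted table: 0-9, '+', '-', 'E', 'e', '.' and the space character.
 * It does NOT verify that `s` is a well-formed number. */
static bool ds_charset_ok(const char *s) {
  size_t len = strlen(s);
  if (len > 16) return false;
  /* strspn() counts the leading bytes that belong to the allowed set */
  return strspn(s, "0123456789+-Ee. ") == len;
}
```

For example, `ds_charset_ok("-15573682979e108")` accepts a 16-character string, while a 23-character string such as `"1.5573682978684192e+118"` is rejected on length alone.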

So now that it is clear that the DICOM standard implicitly recommends double (IEEE 754-1985) for representing a Value Representation of type Decimal String (maximum of 16 significant digits), my question is: how do I use the standard C I/O library to convert this binary representation from memory back into ASCII in this size-limited string?

According to random sources on the internet, this is non-trivial, but a generally accepted solution is either:

printf("%1.16e\n", d); // Round-trippable double, always with an exponent

or

printf("%.17g\n", d); // Round-trippable double, shortest possible

Of course, both expressions are invalid in my case, since they can produce output much longer than my maximum of 16 bytes. So what is the way to minimize the loss of precision when writing an arbitrary double value to a string limited to 16 bytes?
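To see how far the two formats overshoot the limit, the output length can be measured without printing anything. This is a small sketch, assuming a C99 `snprintf`; `formatted_length` is a hypothetical helper name introduced here for illustration.

```c
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration): number of characters
 * that formatting `d` with `fmt` would produce. Calling snprintf() with a
 * NULL buffer and size 0 only counts, it writes nothing. */
static int formatted_length(const char *fmt, double d) {
  return snprintf(NULL, 0, fmt, d);
}
```

With "%1.16e", any finite double needs at least 22 characters (one digit, '.', 16 digits, 'e', the exponent sign and at least two exponent digits), so that format can never fit in 16 bytes. "%.17g" can be short for simple values such as 1.0, but grows well past 16 for values that need all 17 digits.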


Edit: in case this is not clear, I am required to follow the standard. I cannot use hex/uuencode encoding.

Edit 2: I am running the comparison using travis-ci; see: here

The solutions suggested so far are:

  1. Serge Ballesta
  2. chux
  3. Mark Dickinson
  4. chux

Results I see over here are:

  • compute1.c leads to a total sum error of: 0.0095729050923877828
  • compute2.c leads to a total sum error of: 0.21764383725715469
  • compute3.c leads to a total sum error of: 4.050031792674619
  • compute4.c leads to a total sum error of: 0.001287056579548422

So compute4.c leads to the best precision (0.001287056579548422 < 4.050031792674619), but takes triple (x3) the overall execution time (only tested in debug mode using the time command).

Solution

It is trickier than first thought.

Given the various corner cases, it seems best to try at a high precision and then work down as needed.

  1. Any negative number prints the same as a positive number with 1 less digit of precision, due to the '-'.

  2. A '+' sign is not needed at the beginning of the string nor after the 'e'.

  3. A '.' is not needed.

  4. It is dangerous to use anything other than sprintf() to do the mathematical part, given so many corner cases. With the various rounding modes, FLT_EVAL_METHOD, etc., leave the heavy coding to well-established functions.

  5. When an attempt is too long by more than 1 character, iterations can be saved. E.g. if an attempt with precision 14 resulted in a width of 20, there is no need to try precision 13 and 12; just go to 11.

  6. Scaling of the exponent, due to the removal of the '.', must be done after sprintf() to avoid 1) injecting computational error and 2) decrementing a double to below its minimum exponent.

  7. The maximum relative error is less than 1 part in 2,000,000,000, as with -1.00000000049999e-200. The average relative error is about 1 part in 50,000,000,000.

  8. 14-digit precision, the highest, occurs with numbers like 12345678901234e1, so start with 16-2 digits.
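The iteration-saving rule in item 5 can be restated as a tiny function. This is only a sketch: `next_precision` is a hypothetical name, and it assumes the previous attempt was too long (`n > width`). Dropping one digit fewer than the overshoot looks like a deliberately cautious choice, presumably because the exponent can change length as digits are removed.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration) restating item 5:
 * given the precision just tried, the length `n` it produced and the
 * allowed `width` (with n > width), pick the next precision to try. */
static int next_precision(int precision, size_t n, size_t width) {
  if (n > width + 1) {
    /* Too long by k > 1 characters: drop k - 1 digits in one step. */
    return precision - (int)(n - width - 1);
  }
  /* Too long by exactly one character: drop a single digit. */
  return precision - 1;
}
```

With a 16-character limit, `next_precision(14, 20, 16)` gives 11, matching the example in item 5.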


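#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

// Rewrite "d.ddddde[+-]xx" in place as "dddddde[-]yy": drop the '.', the '+'
// and any leading exponent zeros. Returns the new length.
static size_t shrink(char *fp_buffer) {
  int lead, expo;
  long long mant;
  int n0, n1;
  // n0 and n1 record the offsets just before and just after the fraction
  int n = sscanf(fp_buffer, "%d.%n%lld%ne%d", &lead, &n0, &mant, &n1, &expo);
  if (n == 1) {
    // "%.0e" output such as "5e+10" has no '.' and no fraction digits
    n = sscanf(fp_buffer, "%de%d", &lead, &expo);
    assert(n == 2);
    return sprintf(fp_buffer, "%de%d", lead, expo);
  }
  assert(n == 3);
  // Re-emit the fraction with its leading zeros and shift the exponent
  return sprintf(fp_buffer, "%d%0*llde%d", lead, n1 - n0, mant,
          expo - (n1 - n0));
}

// Write `value` into `dest` using at most `width` characters.
// Returns 0 on success, 1 for non-finite input, 2 if `width` is too small,
// 3 if no precision fits.
int x16printf(char *dest, size_t width, double value) {
  if (!isfinite(value)) return 1;

  if (width < 5) return 2;
  if (signbit(value)) {
    // Emit the '-' ourselves and format the magnitude in one less character
    value = -value;
    strcpy(dest++, "-");
    width--;
  }
  int precision = width - 2;
  while (precision > 0) {
    char buffer[width + 10];
    // %.*e prints 1 digit, '.' and then `precision - 1` digits
    snprintf(buffer, sizeof buffer, "%.*e", precision - 1, value);
    size_t n = shrink(buffer);
    if (n <= width) {
      strcpy(dest, buffer);
      return 0;
    }
    // Too long: drop several digits at once when the overshoot exceeds 1
    if (n > width + 1) precision -= n - width - 1;
    else precision--;
  }
  return 3;
}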

Test code

#include <stdlib.h>

// Build a random finite double by filling its bytes with random values
double rand_double(void) {
  union {
    double d;
    unsigned char uc[sizeof(double)];
  } u;
  do {
    for (size_t i = 0; i < sizeof(double); i++) {
      u.uc[i] = rand();
    }
  } while (!isfinite(u.d));  // retry on infinity or NaN
  return u.d;
}

// Print the value at full precision, then the return code and the
// 16-character result in quotes
void x16printf_test(double value) {
  printf("%-27.*e", 17, value);
  char buf[16+1];
  buf[0] = 0;
  int y = x16printf(buf, sizeof buf - 1, value);
  printf(" %d\n", y);
  printf("'%s'\n", buf);
}


int main(void) {
  for (int i = 0; i < 10; i++)
    x16printf_test(rand_double());
}

Output

-1.55736829786841915e+118   0
'-15573682979e108'
-3.06117209691283956e+125   0
'-30611720969e115'
8.05005611774356367e+175    0
'805005611774e164'
-1.06083057094522472e+132   0
'-10608305709e122'
3.39265065244054607e-209    0
'33926506524e-219'
-2.36818580315246204e-244   0
'-2368185803e-253'
7.91188576978592497e+301    0
'791188576979e290'
-1.40513111051994779e-53    0
'-14051311105e-63'
-1.37897140950449389e-14    0
'-13789714095e-24'
-2.15869805640288206e+125   0
'-21586980564e115'
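The relative error claimed in item 7 can be checked against pairs like the ones above by parsing the short string back with strtod(). This is a minimal sketch: `relative_error` is a hypothetical helper name, and it assumes `value` is non-zero.

```c
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration): relative error
 * between `value` (assumed non-zero) and the number that the decimal
 * string `text` parses back to with strtod(). */
static double relative_error(double value, const char *text) {
  double parsed = strtod(text, NULL);
  double ratio = (parsed - value) / value;
  return ratio < 0 ? -ratio : ratio;  /* absolute value without <math.h> */
}
```

For instance, comparing -2.36818580315246204e-244 with '-2368185803e-253' from the output above stays below 1 part in 2,000,000,000. The worst case named in item 7, -1.00000000049999e-200, is left with 10 digits once the sign and the 4-character exponent are accounted for, so it appears to come out as "-1000000000e-209", which sits just under that bound.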
