How to input and output real numbers in assembly language

Question

We solve problems with real numbers in assembly language using the FPU. Usually we write the input and output code in C or with ready-made functions. For example:

    ; Receiving input and output descriptors for the console
    invoke  GetStdHandle,   STD_INPUT_HANDLE
    mov     hConsoleInput,  eax

    invoke  GetStdHandle,   STD_OUTPUT_HANDLE
    mov     hConsoleOutput, eax

    invoke  ClearScreen
    ;input X
    invoke  WriteConsole, hConsoleOutput, ADDR aszPromptX,\
            LENGTHOF aszPromptX - 1, ADDR BufLen, NULL
    invoke  ReadConsole, hConsoleInput, ADDR Buffer,\
            LENGTHOF Buffer, ADDR BufLen, NULL
    finit
    invoke  StrToFloat, ADDR Buffer, ADDR X

How can I input and output real numbers in assembly language without using ready-made functions?

Recommended answer

This is really the same question as how to implement those functions / how they work under the hood. I'm just going to talk about input in this answer; I'm not sure what algorithms are good for float->string.

OS-provided functions let you read / write (print) characters, one at a time or in blocks. The interesting / FP-specific part of the problem is only the float->string and string->float part. Everything else is the same as for reading/printing integers (modulo calling-convention differences: floats are usually returned in FP registers).

Correctly implementing strtod (string to double) and the single-precision equivalent is highly non-trivial if you want the result to always be correctly rounded to the nearest representable FP value, especially if you want it to also be efficient, and work for inputs right up to the limits of the biggest finite values that double can hold.

Once you know the details of the algorithm (in terms of looking at single digits and doing FP multiplies / divides / additions, or integer operations on the FP bit-pattern), you can implement it in asm for any platform you like. You used an x87 finit instruction in your example for some reason.

See http://www.exploringbinary.com/how-glibc-strtod-works/ for a detailed look at glibc's implementation, and http://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/ for another widely-used implementation.

Outlining the first article, glibc's strtod uses extended-precision integer arithmetic. It parses the input decimal string to determine the integer part and the fractional part. For example, 456.833e2 (scientific notation) has an integer part of 45683 and a fractional part of 0.3.
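The splitting step can be sketched in C, which is easier to follow than asm and translates to it mechanically. This is not glibc's code: `split_decimal` and `struct parts` are hypothetical names, and the sketch assumes a short, unsigned input and gives up when the exponent would require padding with zeros.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: the digits on each side of the decimal point
   once the exponent has been applied. */
struct parts {
    char int_digits[64];   /* "45683" for "456.833e2" */
    char frac_digits[64];  /* "3"     for "456.833e2" */
};

/* Assumes input of the form digits[.digits][e[+-]digits], no sign,
   fewer than 64 digits. Returns 0 on success, -1 otherwise. */
int split_decimal(const char *s, struct parts *out)
{
    char digits[64];
    size_t n = 0, point = 0;
    int seen_point = 0;

    /* Collect the digits and remember where the decimal point was. */
    for (; isdigit((unsigned char)*s) || (*s == '.' && !seen_point); s++) {
        if (*s == '.') { seen_point = 1; point = n; continue; }
        if (n >= sizeof digits - 1) return -1;
        digits[n++] = *s;
    }
    if (!seen_point) point = n;

    /* Parse the optional exponent by hand. */
    long e = 0;
    if (*s == 'e' || *s == 'E') {
        s++;
        int neg = (*s == '-');
        if (*s == '-' || *s == '+') s++;
        for (; isdigit((unsigned char)*s); s++)
            if (e < 1000) e = e * 10 + (*s - '0');
        if (neg) e = -e;
    }

    /* The exponent just moves the decimal point. */
    long pos = (long)point + e;
    if (pos < 0 || pos > (long)n) return -1;  /* zero padding not handled */

    memcpy(out->int_digits, digits, (size_t)pos);
    out->int_digits[pos] = '\0';
    memcpy(out->frac_digits, digits + pos, n - (size_t)pos);
    out->frac_digits[n - (size_t)pos] = '\0';
    return 0;
}
```

Calling `split_decimal("456.833e2", &p)` leaves "45683" in `p.int_digits` and "3" in `p.frac_digits`.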

It converts both parts to floating point separately. The integer part is easy, because there's already hardware support for converting integers to floating point, e.g. x87 fild or SSE2 cvtsi2sd, or the equivalent on other architectures. But if the integer part is larger than the maximum 64-bit integer, it's not that simple: you need to convert a BigInteger to float/double, which hardware doesn't support.
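A minimal C sketch of the easy case, assuming the integer part fits in a signed 64-bit integer (the operand type of fild and cvtsi2sd). `int_digits_to_double` is a hypothetical name; the cast is the step a compiler would normally turn into the hardware conversion instruction.

```c
#include <stdint.h>

/* Convert a run of decimal digits to double through a signed 64-bit
   integer. Returns 0 on success, or -1 if the value does not fit, which
   is the case where a big-integer conversion would be needed instead. */
int int_digits_to_double(const char *digits, double *out)
{
    int64_t v = 0;
    for (; *digits >= '0' && *digits <= '9'; digits++) {
        int d = *digits - '0';
        if (v > (INT64_MAX - d) / 10)
            return -1;           /* v * 10 + d would overflow */
        v = v * 10 + d;
    }
    *out = (double)v;            /* int -> double, rounded by the hardware */
    return 0;
}
```

The overflow check runs before the multiply, so the function never relies on signed wraparound, which is undefined behaviour in C.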

Note that even FLT_MAX (single precision) for IEEE binary32 float is (2 − 2^−23) × 2^127, which is just slightly below 2^128, so you could use a 128-bit integer for string->float, and if that wraps then the correct float result is +Infinity. The FLT_MAX bit pattern is 0x7f7fffff: mantissa all-ones = 1.999... with max exponent. In decimal, it's ~3.4 × 10^38.

But if you didn't care about efficiency, I think you could convert each digit to a float (or index an array of already-converted float values), and do the usual total = total*10 + digit, or in this case total = total*10.0 + digit_values[digit]. FP mul / add is exact for integers up to the point where two adjacent representable values are farther apart than 1.0 (i.e. when nextafter(total, +Infinity) is total+2.0), i.e. when 1 ulp is greater than 1.0.
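A C sketch of that loop, plus a check of where it stops being exact. For double the limit is 2^53, where 1 ulp becomes 2.0. The function names are hypothetical, and the exactness test adds 1.0 and compares instead of calling nextafter, only to keep the example free of libm.

```c
/* Accumulate decimal digits with FP multiply and add. Exact only while
   the running total stays below 2^53 for double. */
double digits_to_double(const char *digits)
{
    static const double digit_values[10] =
        { 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0 };
    double total = 0.0;
    for (; *digits >= '0' && *digits <= '9'; digits++)
        total = total * 10.0 + digit_values[*digits - '0'];
    return total;
}

/* Nonzero if x + 1.0 is exactly representable, i.e. 1 ulp at x is still
   at most 1.0. volatile forces the sum to be rounded to double. */
int plus_one_is_exact(double x)
{
    volatile double y = x + 1.0;
    return y - x == 1.0;
}
```

`plus_one_is_exact(4503599627370496.0)` (2^52) is true, while `plus_one_is_exact(9007199254740992.0)` (2^53) is false, because 2^53 + 1 rounds back to 2^53.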

Actually, to get correct rounding you need to add the small values first, otherwise they each separately round down when all together they could have bumped a large value up to the next representable value.
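A small C demonstration of why the order matters, using 2^53 and two 1.0 terms. `sum3` is a hypothetical helper that adds strictly left to right and rounds after each addition.

```c
/* Add three doubles left to right. volatile forces each partial sum to
   be rounded to double, which matters on x87, where intermediates can
   otherwise keep extra precision. */
double sum3(double a, double b, double c)
{
    volatile double t = a + b;
    t = t + c;
    return t;
}
```

`sum3(9007199254740992.0, 1.0, 1.0)` returns 9007199254740992.0, because each 1.0 is rounded away separately. `sum3(1.0, 1.0, 9007199254740992.0)` returns 9007199254740994.0, because the small values first combine into 2.0, which is large enough to reach the next representable value.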

So you can probably use the FPU for this if you do it carefully, like working in chunks of 8 digits and scaling by 10^8 or something, and add starting with the smallest. You could convert each string of 8 digits to integer and use hardware int->float.
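A rough C sketch of the chunked idea, under stated assumptions: `chunked_digits_to_double` is a hypothetical name, the input is a plain run of digits well under 300 characters, and the result is not guaranteed to be correctly rounded, because the scale factor itself is rounded once it passes 10^22.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert a run of decimal digits to double in chunks of up to 8 digits,
   adding the least significant chunk first. */
double chunked_digits_to_double(const char *digits)
{
    size_t len = strspn(digits, "0123456789");
    double total = 0.0;
    double scale = 1.0;                  /* 10^(8*k) for chunk k */

    while (len > 0) {
        size_t take = len < 8 ? len : 8;
        uint32_t chunk = 0;              /* at most 99999999 */
        for (size_t i = len - take; i < len; i++)
            chunk = chunk * 10 + (uint32_t)(digits[i] - '0');

        total += (double)chunk * scale;  /* hardware int -> double */
        scale *= 1e8;                    /* exact for 1e8 and 1e16,
                                            rounded from 1e24 onwards */
        len -= take;
    }
    return total;
}
```

Each chunk fits in a 32-bit integer, so the per-digit work is integer arithmetic and the FPU is only used once per 8 digits.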

The fractional part is even trickier, especially if you want to avoid repeated division by 10 to get the place values. You should avoid that, both because it's slow and because 1/10 is not exactly representable in binary floating point, so all your place values will have rounding error if you do it the "obvious" way.
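For short fractions, one way to sidestep inexact place values is to treat the fractional digits as an integer and divide once by a power of ten. This C sketch is a simpler stand-in, not what glibc does: `frac_digits_to_double` is a hypothetical name, and digits beyond the 15th are silently ignored so that both operands stay exact.

```c
#include <stdint.h>

/* Convert up to 15 fractional digits to double as digits / 10^n.
   The numerator is below 10^15 < 2^53 and 10^n is exact in double for
   n <= 22, so the single division is the only rounding step. */
double frac_digits_to_double(const char *digits)
{
    uint64_t num = 0;
    double den = 1.0;
    int n = 0;
    for (; *digits >= '0' && *digits <= '9' && n < 15; digits++, n++) {
        num = num * 10 + (uint64_t)(*digits - '0');
        den *= 10.0;
    }
    return (double)num / den;
}
```

For "833" this computes 833.0 / 1000.0, which rounds once to the double nearest 0.833, instead of summing 8/10 + 3/100 + 3/1000 from three separately rounded terms.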

But if the integer part is very large, all 53 mantissa bits of the double might already be determined by the integer part. So glibc checks, and only does big-integer division to get the number of bits it needs (if any) from the fractional part.

Anyway, I highly recommend reading both articles.

BTW, see https://en.wikipedia.org/wiki/Double-precision_floating-point_format if you're not familiar with the bit patterns that IEEE754 binary64, aka double, uses to represent numbers. You don't need to be to write a simplistic implementation, but it does help to understand float. And with x86 SSE, you need to know where the sign bit is to implement absolute value (ANDPS) or negation (XORPS); see the Stack Overflow question "Fastest way to compute absolute value using SSE". There aren't special instructions for abs or neg; you just use boolean ops to manipulate the sign bit. (Much more efficient than subtracting from zero.)
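The same sign-bit trick, written in portable C for illustration. It assumes `double` is IEEE binary64 with the sign in bit 63; the helper names are hypothetical. In asm the AND and XOR would be applied directly to the XMM register with a mask constant.

```c
#include <stdint.h>
#include <string.h>

#define SIGN_MASK UINT64_C(0x8000000000000000)  /* bit 63 of a binary64 */

static uint64_t to_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

static double from_bits(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Absolute value: clear the sign bit (what an AND with ~SIGN_MASK does). */
double abs_by_mask(double d) { return from_bits(to_bits(d) & ~SIGN_MASK); }

/* Negation: flip the sign bit (what an XOR with SIGN_MASK does). */
double neg_by_flip(double d) { return from_bits(to_bits(d) ^ SIGN_MASK); }
```

Unlike `0.0 - x`, flipping the bit also turns +0.0 into -0.0.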

If you don't care about being accurate to the last ULP (unit in the last place = lowest bit of the mantissa), then you can do a simpler algorithm of multiplying by 10 and adding like for string -> integer, and then scale by a power of 10 at the end.
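A C sketch of that simpler algorithm, assuming well-formed input of the form [sign]digits[.digits][e[sign]digits]. `simple_atof` is a hypothetical name; the result is not always correctly rounded, and there is no error handling.

```c
/* Multiply-by-10-and-add over all digits, ignoring the decimal point,
   then scale by a power of 10 at the end. */
double simple_atof(const char *s)
{
    int neg = 0;
    if (*s == '-' || *s == '+')
        neg = (*s++ == '-');

    double total = 0.0;
    int exp10 = 0;               /* power of 10 still to be applied */
    int seen_point = 0;
    for (;; s++) {
        if (*s == '.' && !seen_point) { seen_point = 1; continue; }
        if (*s < '0' || *s > '9') break;
        total = total * 10.0 + (*s - '0');
        if (seen_point) exp10--; /* each fractional digit is one 10^-1 */
    }

    if (*s == 'e' || *s == 'E') {
        s++;
        int eneg = 0, e = 0;
        if (*s == '-' || *s == '+')
            eneg = (*s++ == '-');
        for (; *s >= '0' && *s <= '9'; s++)
            if (e < 10000) e = e * 10 + (*s - '0');
        exp10 += eneg ? -e : e;
    }

    /* Scale at the end; every step here can add rounding error. */
    for (; exp10 > 0; exp10--) total *= 10.0;
    for (; exp10 < 0; exp10++) total /= 10.0;
    return neg ? -total : total;
}
```

For "456.833e2" the loop builds 456833.0, the fractional digits leave exp10 at -3, the exponent raises it to -1, and one division by 10 gives roughly 45683.3.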

But a robust library function can't do that, because creating a temporary value many times larger than the final result means it will overflow (to +/- Infinity) for some inputs that are within the range that double can represent. Or possibly underflow to +/- 0.0 if you create smaller temporary values.
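The overflow is easy to reproduce. This C sketch mimics what the simple algorithm does for an input made of a 1 followed by many zeros and a matching negative exponent, whose true value is exactly 1. `naive_one` is a hypothetical name.

```c
/* Build 1 followed by `zeros` zeros by repeated multiplication, then
   scale back down by the same power of 10. Mathematically the result is
   always 1.0. */
double naive_one(int zeros)
{
    double total = 1.0;
    for (int i = 0; i < zeros; i++)
        total *= 10.0;   /* becomes +Infinity once it passes about 1.8e308 */
    for (int i = 0; i < zeros; i++)
        total /= 10.0;   /* Infinity / 10 is still Infinity */
    return total;
}
```

`naive_one(300)` comes back very close to 1.0, but `naive_one(400)` returns +Infinity even though the value being parsed is comfortably within the range of double.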

Handling the integer and fractional part separately avoids the overflow problem.

See this C implementation on codereview.SE for an example of a very simple multiply/add approach that will probably overflow. I only skimmed it quickly, but I don't see it splitting integer/fractional part. It only handles scientific notation E99 or whatever at the end, with repeated multiply or divide by 10.
