Base conversion of arbitrary sized numbers (PHP)


Problem description

I have a long "binary string" like the output of PHP's pack function.

How can I convert this value to base62 (0-9a-zA-Z)? The built in maths functions overflow with such long inputs, and BCmath doesn't have a base_convert function, or anything that specific. I would also need a matching "pack base62" function.

Solution

I think there is a misunderstanding behind this question. Base conversion and encoding/decoding are different. The output of base64_encode(...) is not one large base64 number. It's a series of discrete base64 digits, each one encoding 6 bits of the input. That is why BC Math does not work here: BC Math is concerned with single large numbers, not with strings that are in reality groups of small numbers representing binary data.

Here's an example to illustrate the difference:

base64_encode(1234) = "MTIzNA=="
base64_convert(1234) = "TS" //if the base64_convert function existed

base64 encoding breaks the input up into groups of 3 bytes (3*8 = 24 bits), then converts each sub-segment of 6 bits (2^6 = 64, hence "base64") to the corresponding base64 character (values are "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", where A = 0, / = 63).

In our example, base64_encode() treats "1234" as a string of 4 characters, not an integer (because base64_encode() does not operate on integers). Therefore it outputs "MTIzNA==", because (in US-ASCII/UTF-8/ISO-8859-1) "1234" is 00110001 00110010 00110011 00110100 in binary. This gets broken into 001100 (12 in decimal, character "M") 010011 (19 in decimal, character "T") 001000 (8, "I") 110011 (51, "z") 001101 (13, "N") 00. Since the last slice isn't complete, it gets padded with 0's and the value is 000000 ("A"). Because everything is done in groups of 3 input bytes, there are 2 groups: "123" and "4". The last group holds only 1 byte, so it yields just 2 characters ("NA"), and the output is padded with ='s to fill the usual 4 characters per group, so the whole output becomes "MTIzNA==".
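The walkthrough above can be reproduced in a few lines of PHP. This is only an illustrative sketch: manual_base64_encode() is a hypothetical name, and in real code you would simply call the built-in base64_encode().

```php
<?php
// Hypothetical helper for illustration only; use base64_encode() in real code.
function manual_base64_encode(string $input): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

    if ($input === '') {
        return '';
    }

    // Write every input byte as 8 binary digits.
    $bits = '';
    foreach (str_split($input) as $byte) {
        $bits .= str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }

    // Cut the bit string into 6-bit slices; the last slice is padded with 0 bits.
    $output = '';
    foreach (str_split($bits, 6) as $slice) {
        $slice = str_pad($slice, 6, '0', STR_PAD_RIGHT);
        $output .= $alphabet[bindec($slice)];
    }

    // Pad with '=' until the output length is a multiple of 4 characters.
    while (strlen($output) % 4 !== 0) {
        $output .= '=';
    }

    return $output;
}

echo manual_base64_encode('1234'), "\n"; // MTIzNA==
echo base64_encode('1234'), "\n";        // MTIzNA==
```

Every 6-bit slice is looked up on its own, which is why the result is a sequence of independent digits rather than one big number.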

Converting to base64, on the other hand, takes a single integer value and converts it into a single base64 value. For our example, 1234 (decimal) is "TS" (base64), if we use the same string of base64 values as above. Working backward, and left-to-right: T = 19 (column 1), S = 18 (column 0), so (19 * 64^1) + (18 * 64^0) = 19 * 64 + 18 = 1234 (decimal). The same number can be represented as "4D2" in hexadecimal (base16): (4 * 16^2) + (D * 16^1) + (2 * 16^0) = (4 * 256) + (13 * 16) + (2 * 1) = 1234 (decimal).
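For comparison, here is a rough sketch of what the imaginary base64_convert() could look like for ordinary PHP integers. The function does not exist in PHP; the name and the choice to reuse the base64 alphabet as digits are assumptions made to match the example above.

```php
<?php
// Hypothetical function: PHP has no built-in base64_convert().
function base64_convert(int $number): string
{
    // Same digit order as the base64 alphabet: A = 0 ... / = 63.
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

    if ($number < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported.');
    }
    if ($number === 0) {
        return $alphabet[0];
    }

    // Repeatedly divide by 64; the remainders are the digits, least significant first.
    $digits = '';
    while ($number > 0) {
        $digits = $alphabet[$number % 64] . $digits;
        $number = intdiv($number, 64);
    }

    return $digits;
}

echo base64_convert(1234), "\n";  // TS
echo base64_encode('1234'), "\n"; // MTIzNA==
```

This only works while the value fits in a native PHP integer, which is exactly the limit the question runs into.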

Unlike encoding, which takes a string of characters and changes it, base conversion does not alter the actual number, just changes its presentation. The hexadecimal (base16) "FF" is the same number as decimal (base10) "255", which is the same number as "11111111" in binary (base2). Think of it like currency exchange, if the exchange rate never changed: $1 USD has the same value as £0.79 GBP (exchange rate as of today, but pretend it never changes).
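PHP's built-in helpers make the same point for small numbers. The snippet below uses only standard functions (hexdec, bindec, base_convert); note that base_convert() is limited to bases 2 through 36 and can lose precision on large inputs, so it does not solve the original problem.

```php
<?php
// Three different spellings of one and the same integer.
$fromHex     = hexdec('FF');
$fromBinary  = bindec('11111111');
$fromDecimal = (int) '255';

var_dump($fromHex === $fromBinary && $fromBinary === $fromDecimal); // bool(true)

// Converting only changes how the number is written, not its value.
echo base_convert('255', 10, 16), "\n";  // ff
echo base_convert('255', 10, 2), "\n";   // 11111111
echo base_convert('1234', 10, 16), "\n"; // 4d2
```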

In computing, integers are typically operated on as binary values (because it's easy to build 1-bit arithmetic units and then stack them together to make 32-bit/etc. arithmetic units). To do something as simple as "255 + 255" (decimal), the computer needs to first convert the numbers to binary ("11111111" + "11111111") and then perform the operation in the Arithmetic Logic Unit (ALU).

Almost all other uses of bases are purely for the convenience of humans (presentational) - computers display their internal value 11111111 (binary) as 255 (decimal) because humans are trained to operate on decimal numbers. The function base64_convert() doesn't exist as part of the standard PHP repertoire because it's not often useful to anyone: not many humans read base64 numbers natively. By contrast, binary 1's and 0's are sometimes useful for programmers (we can use them like on/off switches!), and hexadecimal is convenient for humans editing binary data because an entire 8-bit byte can be represented unambiguously as 00 through FF, without wasting too much space.

You may ask, "if base conversion is just for presentation, why does BC Math exist?" That's a fair question, and also exactly why I said "almost" purely for presentation: typical computers are limited to 32-bit or 64-bit wide numbers, which are usually plenty big enough. Sometimes you need to operate on really, really big numbers (RSA moduli for example), which don't fit in those registers. BC Math solves this problem by acting as an abstraction layer: it converts huge numbers into long strings of text. When it's time to do some operation, BC Math painstakingly breaks the long strings of text up into small chunks which the computer can handle. It's much, much slower than native operations, but it can handle arbitrary-sized numbers.
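If you really do want to treat the whole binary string as one huge number and write it in base62, the same "small chunks" idea can be done by hand without BC Math. The following is a minimal sketch, not a standard API: convert_digits(), base62_encode() and base62_decode() are hypothetical names, the digit order 0-9a-zA-Z is taken from the question, and mapping each leading zero byte to a leading "0" character (so that decoding restores the original length) is an assumption borrowed from base58-style encodings.

```php
<?php
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Convert big-endian digits from one base to another by schoolbook long
// division, so every intermediate value stays far below PHP's integer limit.
function convert_digits(array $digits, int $fromBase, int $toBase): array
{
    $result = [];
    while (count($digits) > 0) {
        $quotient = [];
        $remainder = 0;
        foreach ($digits as $digit) {
            $accumulator = $remainder * $fromBase + $digit;
            $q = intdiv($accumulator, $toBase);
            $remainder = $accumulator % $toBase;
            if ($q > 0 || count($quotient) > 0) { // drop leading zeros
                $quotient[] = $q;
            }
        }
        array_unshift($result, $remainder); // next least significant digit
        $digits = $quotient;
    }
    return $result;
}

function base62_encode(string $binary): string
{
    $rest = ltrim($binary, "\0");
    $prefix = str_repeat('0', strlen($binary) - strlen($rest));
    if ($rest === '') {
        return $prefix;
    }
    $out = '';
    foreach (convert_digits(array_values(unpack('C*', $rest)), 256, 62) as $digit) {
        $out .= BASE62_ALPHABET[$digit];
    }
    return $prefix . $out;
}

function base62_decode(string $text): string
{
    $rest = ltrim($text, '0');
    $prefix = str_repeat("\0", strlen($text) - strlen($rest));
    if ($rest === '') {
        return $prefix;
    }
    $digits = [];
    foreach (str_split($rest) as $char) {
        $value = strpos(BASE62_ALPHABET, $char);
        if ($value === false) {
            throw new InvalidArgumentException("Invalid base62 character: $char");
        }
        $digits[] = $value;
    }
    return $prefix . pack('C*', ...convert_digits($digits, 62, 256));
}

$packed = pack('N', 1234);            // "\x00\x00\x04\xD2"
$encoded = base62_encode($packed);
echo $encoded, "\n";                  // 00jU
var_dump(base62_decode($encoded) === $packed); // bool(true)
```

The long division makes this quadratic in the input length, which is fine for hashes or keys but slow for large payloads. For those, a chunked encoding in the style of base64 is the better fit.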
