Compressing big number (or string) to small value


Problem description

My ASP.NET page has following query string parameter:

…?IDs=1000000012,1000000021,1000000013,1000000022&...

Here the IDs parameter will always contain numbers separated by some delimiter, in this case a comma. Currently there are 4 numbers, but normally there would be between 3 and 7.

Now, I am looking for a method to convert each big number above into the smallest possible value; specifically, to compress the value of the IDs query string parameter. Both an algorithm that compresses each number and one that compresses the whole value of the IDs query string parameter are welcome.

  1. Encoding or decoding is not an issue; I just need to compress the value of the IDs query string parameter.
  2. Creating some unique small value for IDs and then retrieving the real value from some data source is out of scope.

Is there an algorithm to compress such big numbers to small values, or to compress the value of the IDs query string parameter altogether?

Solution

You basically need so much room for your numbers because you are using base 10 to represent them. An improvement would be to use base 16 (hex). So for example, you could represent 255 (3 digits) as ff (2 digits).
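To make the base 10 versus base 16 comparison concrete, here is a minimal sketch using the standard `Convert` class. The `HexDemo` class and its method names are made up for illustration, and it assumes the IDs are non-negative.

```csharp
using System;

public static class HexDemo
{
    // Converts a non-negative ID to lowercase hexadecimal text.
    public static string ToHex(long id) => Convert.ToString(id, 16);

    // Parses hexadecimal text produced by ToHex back into the original ID.
    public static long FromHex(string hex) => Convert.ToInt64(hex, 16);
}
```

`HexDemo.ToHex(255)` gives `ff`, and the 10-digit ID 1000000012 becomes the 8-character `3b9aca0c`.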

You can take that concept further by using a much larger number base... the set of all characters that are valid query string parameters:

A-Z, a-z, 0-9, '.', '-', '~', '_', '+'

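That gives you a base of 67 characters to work with (see Wikipedia on QueryString). One caveat: '+' is conventionally decoded as a space in query strings, so you may need to percent-encode it or leave it out of the set, which would leave 66 characters.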

Have a look at this SO post for approaches to converting base 10 to arbitrary number bases.

EDIT:

In the linked SO post, look at this part:

string xx = IntToString(42, 
            new char[] { '0','1','2','3','4','5','6','7','8','9',
            'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
            'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x'});

That's almost what you need. Just expand it by adding the few characters it is missing:

yz.-~_+
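As a rough sketch of where that leads, the encoder below uses the expanded 67-character set as a single string. It is not the code from the linked post; `QueryIdEncoder`, `Encode` and `EncodeIds` are hypothetical names, and it assumes every ID is a non-negative value that fits in a `long`.

```csharp
using System;
using System.Linq;

public static class QueryIdEncoder
{
    // 67-character alphabet: the array shown above plus "yz.-~_+".
    // Including '+' follows the character list in this answer; some frameworks
    // decode '+' as a space, so treat it as an assumption to verify.
    public const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.-~_+";

    // Converts a non-negative number to text in base Alphabet.Length.
    public static string Encode(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        int targetBase = Alphabet.Length;
        string result = string.Empty;
        do
        {
            // The remainder selects the next digit, least significant first.
            result = Alphabet[(int)(value % targetBase)] + result;
            value /= targetBase;
        } while (value > 0);
        return result;
    }

    // Encodes every comma-separated ID in the parameter value.
    public static string EncodeIds(string ids) =>
        string.Join(",", ids.Split(',').Select(id => Encode(long.Parse(id))));
}
```

Each 10-digit ID from the question needs only 5 characters in this base, so `EncodeIds` shrinks the 43-character parameter value to 23 characters.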

That post is missing a method to go back to base 10. I'm not going to write it :-) but the procedure is like this:

Define a counter I'll call TOTAL.

Look at the rightmost character and find its position in the array. TOTAL = (the position of the character in the array). Example: Input is BA1. TOTAL is now 1 (since "1" is in position 1 in the array).

Now look at the next character to the left and find its position in the array. TOTAL += 67 * (the position of the character in the array). Example: Input is BA1. "A" is in position 10, so TOTAL is now (67 * 10) + 1 = 671.

Now look at the next character to the left of that one and find its position in the array. TOTAL += 67 * 67 * (the position of the character in the array). Example: Input is BA1. "B" is in position 11, so TOTAL is now (67 * 67 * 11) + (67 * 10) + 1 = 50050.

And so on, multiplying by another factor of 67 (the length of the expanded array) for each step to the left.

I suggest you write a unit test that converts a bunch of base 10 numbers into base 67 and then back again to make sure your conversion code works properly.

Note how you represented a 5 digit base 10 number in just 3 digits of base 67 :-)
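The TOTAL procedure and the suggested round-trip check could be sketched as follows. This is one possible reading, not code from the linked post: it assumes the base is 67 (the length of the expanded alphabet), that decoded values fit in a `long`, and the `Base67` class with its `Encode`, `Decode` and `RoundTrips` members is hypothetical.

```csharp
using System;

public static class Base67
{
    // Assumed alphabet: the linked post's array extended with "yz.-~_+" (67 characters).
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.-~_+";

    // Converts a non-negative number to base 67 text.
    public static string Encode(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        string result = string.Empty;
        do
        {
            result = Alphabet[(int)(value % Alphabet.Length)] + result;
            value /= Alphabet.Length;
        } while (value > 0);
        return result;
    }

    // Walks the text from right to left, adding position * 67^n to the total.
    public static long Decode(string text)
    {
        long total = 0;
        long placeValue = 1; // 67^0 for the rightmost character
        for (int i = text.Length - 1; i >= 0; i--)
        {
            int position = Alphabet.IndexOf(text[i]);
            if (position < 0)
                throw new FormatException($"Unexpected character '{text[i]}'.");
            total += placeValue * position;
            placeValue *= Alphabet.Length; // next character to the left is worth 67 times more
        }
        return total;
    }

    // Round-trip check suitable for a unit test.
    public static bool RoundTrips(long value) => Decode(Encode(value)) == value;
}
```

With this alphabet, `Base67.Decode("BA1")` evaluates 67 * 67 * 11 + 67 * 10 + 1 = 50050, and `Base67.RoundTrips(1000000012)` should hold for any of the IDs in the question.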
