Large number array compression
Question
I've got a JavaScript application that sends a large amount of numerical data down the wire. This data is then stored in a database. I am having size issues (too much bandwidth, and the database is getting too big). I am now ready to sacrifice some performance for compression.
I was thinking of implementing base 62 versions of number.toString(62) and parseInt(compressed, 62). This would certainly reduce the size of the data, but before I go ahead I thought I would put it to the folks here, as I know there must be some outside-the-box solution I have not considered.
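One caveat on the idea above: the built-in `Number.prototype.toString` and `parseInt` accept a radix only from 2 to 36, so base 62 needs a hand-written converter. As a baseline for comparison, a minimal sketch using the built-in base 36 might look like this (the names `encode36` and `decode36` are made up for illustration, and the comma separator is an assumption):

```javascript
// Encode an array of integers as base-36 strings joined by commas.
// Assumes every value is an integer small enough to be represented exactly.
function encode36(nums) {
    return nums.map(function (n) { return n.toString(36); }).join(',');
}

// Reverse of encode36: split on commas and parse each piece as base 36.
function decode36(s) {
    if (s === '') return [];
    return s.split(',').map(function (t) { return parseInt(t, 36); });
}
```

Base 36 uses only digits and lowercase letters, so the result is safe to embed in a JSONP string without escaping.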
The basic specs are:
- Compress large number arrays into strings for JSONP transfer (so I think UTF is out)
- Be relatively fast. I'm not expecting the same performance as I have now, but I don't want gzip compression either.
Any ideas would be greatly appreciated.
Thanks,
Guido Tapia
Recommended answer
Another way of doing this might be to encode to binary types such as signed/unsigned ints, and manually decode as at http://snippets.dzone.com/posts/show/685 which would require server side code to create the binary data.
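As a rough illustration of the binary idea, the sketch below packs signed 32-bit integers into bytes and decodes them again. It assumes an environment with typed arrays (`ArrayBuffer`, `DataView`, `Uint8Array`) and values that fit in 32 bits; `packInt32` and `unpackInt32` are hypothetical names, not code from the linked snippet:

```javascript
// Pack signed 32-bit integers into a byte array, 4 bytes each, little-endian.
// Assumes every value fits in the range -2^31 .. 2^31 - 1.
function packInt32(nums) {
    var view = new DataView(new ArrayBuffer(nums.length * 4));
    for (var i = 0; i < nums.length; i++) {
        view.setInt32(i * 4, nums[i], true); // true = little-endian
    }
    return new Uint8Array(view.buffer);
}

// Read the bytes back as signed 32-bit little-endian integers.
function unpackInt32(bytes) {
    var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    var out = [];
    for (var i = 0; i + 4 <= bytes.byteLength; i += 4) {
        out.push(view.getInt32(i, true));
    }
    return out;
}
```

Raw bytes cannot go into a JSONP string directly, so they would still need a text-safe encoding such as Base64, which expands the data by about a third.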
You could then apply Huffman compression, or something similar such as RLE (see http://rosettacode.org/wiki/Run-length_encoding#JavaScript for an implementation, though it may have some issues in IE without modification), to compress the data further.
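For number arrays, run-length encoding can be sketched as below. This is a minimal illustration rather than the Rosetta Code implementation, and the names `rleEncode` and `rleDecode` are made up for this example:

```javascript
// Run-length encode: collapse runs of equal values into [value, count] pairs.
function rleEncode(nums) {
    var pairs = [];
    for (var i = 0; i < nums.length; i++) {
        var last = pairs[pairs.length - 1];
        if (last && last[0] === nums[i]) {
            last[1]++; // extend the current run
        } else {
            pairs.push([nums[i], 1]); // start a new run
        }
    }
    return pairs;
}

// Expand [value, count] pairs back into the original array.
function rleDecode(pairs) {
    var out = [];
    for (var i = 0; i < pairs.length; i++) {
        for (var j = 0; j < pairs[i][1]; j++) out.push(pairs[i][0]);
    }
    return out;
}
```

RLE only pays off when the data contains runs of repeated values; on data without repeats the pairs take more space than the input.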
EDIT: Alternatively, you could convert the numbers themselves to a base (radix) whose digits are characters that need no percent-encoding in a URI (see http://en.wikipedia.org/wiki/Percent-encoding), which should work well if many of the numbers are longer than 2 digits. I converted the code at http://code.activestate.com/recipes/111286-numeric-base-converter-that-accepts-arbitrary-digi/ from Python to do this.
It currently doesn't handle floats, but it could be done fairly easily:
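// Build a lookup table mapping each character of s to its index (its digit value).
function get_map(s) {
    var d = {};
    for (var i = 0; i < s.length; i++) {
        d[s.charAt(i)] = i;
    }
    d.length = s.length; // the radix
    d._s = s;            // the digit alphabet, for index -> character lookups
    return d;
}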
var separate_with = '~';
var encodable = get_map('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_.'); // '-' is reserved for the sign of negative numbers
var base10 = get_map('0123456789');
// Uncomment the block below for length/speed testing in a wider base.
// You may wish to experiment with the ranges to find a happy medium between bandwidth and DB space.
// Note: this range includes '-' and '~', which collide with the sign and the separator above.
/*var encodable = '';
for (var i = 1; i < 128; i++) {
    encodable += String.fromCharCode(i);
}
encodable = get_map(encodable);*/
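// Convert the digit string `number` from the base described by fromdigits
// to the base described by todigits (both built with get_map).
// Values above 2^53 lose precision because x is a JavaScript number.
function baseconvert(number, fromdigits, todigits) {
    number = String(number);
    var neg = false;
    if (number.charAt(0) == '-') {
        number = number.slice(1);
        neg = true;
    }
    // make an integer out of the number
    var x = 0;
    for (var i = 0; i < number.length; i++) {
        x = x * fromdigits.length + fromdigits[number.charAt(i)];
    }
    // zero would otherwise produce an empty string
    if (x === 0) return todigits._s.charAt(0);
    // create the result in base 'todigits.length'
    var res = '';
    while (x > 0) {
        var remainder = x % todigits.length;
        res = todigits._s.charAt(remainder) + res;
        x = Math.floor(x / todigits.length);
    }
    return neg ? '-' + res : res;
}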
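// Encode an array of integers as one string of base-converted values.
function encodeNums(L) {
    var r = [];
    for (var i = 0; i < L.length; i++) {
        r.push(baseconvert(L[i], base10, encodable));
    }
    return r.join(separate_with);
}
// Decode a string produced by encodeNums back into an array of integers.
function decodeNums(s) {
    var r = [];
    var parts = s.split(separate_with);
    for (var i = 0; i < parts.length; i++) {
        r.push(parseInt(baseconvert(parts[i], encodable, base10), 10));
    }
    return r;
}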
var test = [5, 654645, 24324, 652124, 65, 65289543, 65278432, 643175874158, 652754327543]
alert(encodeNums(test))
alert(decodeNums(encodeNums(test)))
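The answer says floats are not handled but could be added. One possible approach, an assumption here rather than the author's stated method, is to scale each float to an integer with a fixed number of decimal places before encoding, and to divide again after decoding. `floatsToInts` and `intsToFloats` are hypothetical helper names:

```javascript
// Convert floats to integers by keeping a fixed number of decimal places.
// Lossy: digits beyond `decimals` are rounded away.
function floatsToInts(values, decimals) {
    var scale = Math.pow(10, decimals);
    return values.map(function (v) { return Math.round(v * scale); });
}

// Undo the scaling applied by floatsToInts.
function intsToFloats(ints, decimals) {
    var scale = Math.pow(10, decimals);
    return ints.map(function (n) { return n / scale; });
}
```

The scaled integers could then be passed through `encodeNums`, and the output of `decodeNums` through `intsToFloats`, provided both sides agree on the number of decimal places.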