Compare JSON and BSON


Question

I am comparing JSON and BSON for serializing objects. The objects contain several arrays, each holding a large number of integers; the object in my test contains about 12,000 integers in total. I am only interested in how the sizes of the serialized results compare. I use JSON.NET as the serialization library, and I chose JSON because I also want to be able to work with the data in JavaScript.

The JSON string is about 43 KB and the BSON result is about 161 KB, a factor of roughly 4. That is not what I expected: I looked at BSON because I thought it stored data more efficiently.

So my question is: why is BSON less efficient here, and can it be made more efficient? Or is there another way to serialize data with arrays containing a large number of integers that can be handled easily in JavaScript?

Below is the code I use to test the JSON/BSON serialization.

        // Requires: using System.IO; using Newtonsoft.Json; using Newtonsoft.Json.Bson;
        // Read the file that contains the JSON string
        string _jsonString = ReadFile();
        object _object = JsonConvert.DeserializeObject(_jsonString);
        // File.Create truncates an existing file; File.OpenWrite does not, so stale
        // trailing bytes from an earlier run could inflate the measured size.
        using (FileStream _fs = File.Create("BsonFileName"))
        using (BsonWriter _bsonWriter = new BsonWriter(_fs) { CloseOutput = false })
        {
            JsonSerializer _jsonSerializer = new JsonSerializer();
            _jsonSerializer.Serialize(_bsonWriter, _object);
            _bsonWriter.Flush();
        }

Thanks in advance,

Ronald

Edit:

Here are the resulting files: https://skydrive.live.com/redir?resid=9A6F31F60861DD2C!362&authkey=!AKU-ZZp8C_0gcR0

Recommended answer

The efficiency of JSON vs. BSON depends on the size of the integers you're storing. There is a crossover point below which the ASCII digits of a number take fewer bytes than a fixed-width integer type. A 64-bit integer, which is how the numbers appear to be stored in your BSON document, takes up 8 bytes. Your numbers are all less than 10,000, which means each one can be stored in ASCII in at most 4 bytes (one byte per digit, up through 9999). In fact, most of your data looks like it is less than 1000, meaning it fits in 3 or fewer bytes. Of course, parsing those digits back into numbers takes time and isn't cheap, but it saves space. Furthermore, JavaScript uses 64-bit values to represent all numbers, so if you wrote it to BSON after converting each integer to a more appropriate data format, your BSON file could be much larger.
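The byte arithmetic above can be checked with a rough sketch. It assumes the BSON spec's array layout, in which an array is a document whose keys are the element indices as strings, so each element carries a type byte and a null-terminated index key in addition to its value. The sample data and the helper name `encode_bson_int_array` are made up for illustration; this is not the asker's file and not JSON.NET's writer.

```python
import json
import struct

def encode_bson_int_array(values, wide=True):
    """Hand-encode a list of ints as a BSON array body (int64 if wide, else int32)."""
    tag, fmt = (b"\x12", "<q") if wide else (b"\x10", "<i")
    body = b"".join(
        # type byte + array index as a null-terminated key + fixed-width value
        tag + str(index).encode("ascii") + b"\x00" + struct.pack(fmt, value)
        for index, value in enumerate(values)
    )
    # A BSON document is: int32 total length, the elements, a trailing 0x00.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

# Hypothetical sample: 12,000 integers, all below 1000.
values = [i % 1000 for i in range(12000)]

json_size = len(json.dumps(values, separators=(",", ":")).encode("utf-8"))
bson_int64_size = len(encode_bson_int_array(values, wide=True))
bson_int32_size = len(encode_bson_int_array(values, wide=False))

print("JSON text:        ", json_size, "bytes")
print("BSON int64 array: ", bson_int64_size, "bytes")
print("BSON int32 array: ", bson_int32_size, "bytes")
```

For this made-up sample the int64 encoding comes out more than three times the size of the JSON text, and even the int32 encoding stays well above it, because the per-element type byte and index key cost more than the digits they replace.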

According to the spec, BSON contains a lot of metadata that JSON doesn't. This metadata is mostly length prefixes, which let you skip over data you aren't interested in. For example, take the following data:

["hello there, this is an necessarily long string.  It's especially long, but you don't care about it. You're just trying to get to the next element. But I keep going on and on.",
 "oh man. here's another string you still don't care about.  You really just want the third element in the array.  How long are the first two elements? JSON won't tell you",
 "data_you_care_about"]

With JSON, you have to parse the first two strings in their entirety to find out where the third one starts. With BSON, you get markup more like the following (not the actual encoding; I'm making this markup up for the sake of the example):

[175 "hello there, this is an necessarily long string.  It's especially long, but you don't care about it. You're just trying to get to the next element. But I keep going on and on.",
 169 "oh man. here's another string you still don't care about.  You really just want the third element in the array.  How long are the first two elements? JSON won't tell you",
 19 "data_you_care_about"]

Now you can read '175' and skip forward 175 bytes, read '169' and skip forward 169 bytes, and then read '19' and copy the next 19 bytes into your string. That way you don't even have to scan the strings for delimiters.
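The skipping described above can be sketched in a few lines. This uses a simplified, made-up record format (a little-endian int32 length followed by the raw bytes), not the real BSON layout, and the helper names `pack_records` and `read_nth` are hypothetical. The first two records are filler of the same lengths as the example strings.

```python
import struct

def pack_records(records):
    """Prefix each byte string with its length as a little-endian int32."""
    return b"".join(struct.pack("<i", len(r)) + r for r in records)

def read_nth(buffer, n):
    """Return record n, jumping over earlier records without scanning their bytes."""
    offset = 0
    for _ in range(n):
        (length,) = struct.unpack_from("<i", buffer, offset)
        offset += 4 + length  # skip the prefix and the payload in one step
    (length,) = struct.unpack_from("<i", buffer, offset)
    start = offset + 4
    return buffer[start:start + length]

# Filler payloads of 175 and 169 bytes stand in for the two long strings.
buffer = pack_records([b"a" * 175, b"b" * 169, b"data_you_care_about"])

print(read_nth(buffer, 2).decode("ascii"))
```

Reaching the third record costs two length reads and two offset additions, regardless of how long the first two payloads are; a JSON parser would have to examine every character of both strings to find the closing quotes.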

Which one to use depends heavily on your needs. If you're storing enormous documents and have all the time in the world to parse them, but your disk space is limited, use JSON because it's more compact and space efficient. If reducing wait time (perhaps in a server context) matters more to you than saving some disk space, use BSON.

Another thing to consider is human readability. If you need to debug a crash report that contains BSON, you'll probably need a utility to decipher it. You probably can't read raw BSON, but you can read JSON directly.
