Why is Joda Time serialized form so large, and what to do about it?


Question

On my machine, the following code snippet:

DateTime now = DateTime.now();
System.out.println(now);
System.out.println("Date size:\t\t"+serialiseToArray(now).length);
System.out.println("DateString size:\t"+serialiseToArray(now.toString()).length);
System.out.println("java.util.Date size:\t"+serialiseToArray(new Date()).length);
Duration twoHours = Duration.standardHours(2);
System.out.println(twoHours);
System.out.println("Duration size:\t\t"+serialiseToArray(twoHours).length);
System.out.println("DurationString size:\t"+serialiseToArray(twoHours.toString()).length);

gives the following output:

2013-09-09T15:07:44.642+01:00
Date size:      273
DateString size:    36
java.util.Date size:    46
PT7200S
Duration size:      107
DurationString size:    14

As you can see, the serialized org.joda.time.DateTime object is more than five times larger than both its String form, which seems to describe it perfectly, and the equivalent java.util.Date. The Duration object representing 2 hours is also much larger than I would expect: looking at the source, its only member variable appears to be a single long value.

Why are these serialized objects so large? And is there any pre-existing solution for getting a smaller representation?

The serialiseToArray method, for reference:

private static byte[] serialiseToArray(Serializable s)
{
    try
    {
        ByteArrayOutputStream byteArrayBuffer = new ByteArrayOutputStream();
        new ObjectOutputStream(byteArrayBuffer).writeObject(s);
        return byteArrayBuffer.toByteArray();
    }
    catch (IOException ex)
    {
        throw new RuntimeException(ex);
    }
}

Recommended answer

Serializing has some overhead. In this instance the overhead that you notice the most is that the class structure is described in the actual output. And since Duration has a base class (BaseDuration) and two interfaces (ReadableDuration and Serializable), that overhead becomes slightly larger than that of Date (which has no base class and just a single interface).

Those classes are referenced by their fully-qualified class names in the serialized data, and those names alone take up quite a few bytes.
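To see this for yourself, here is a minimal sketch that serializes a java.util.Date and checks whether the class name appears literally in the resulting bytes. The class and helper names (ClassNameInStreamDemo, serialize, containsText) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class ClassNameInStreamDemo {

    // Serializes a single object into its own stream and returns the raw bytes.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // ISO-8859-1 maps every byte to exactly one char, so the binary data
    // can be searched as text without losing anything.
    static boolean containsText(byte[] bytes, String text) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Date());
        System.out.println("serialized size: " + bytes.length);
        System.out.println("contains class name: "
                + containsText(bytes, "java.util.Date"));
    }
}
```

The same check with a Joda Time Duration on the classpath should find both org.joda.time.Duration and org.joda.time.base.BaseDuration in the bytes, which is where most of the extra size comes from.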

Good news: that overhead is only paid once per output stream. If you serialize another Duration object, the difference in size should be rather small.
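The following sketch illustrates the "paid once per stream" effect without needing Joda Time on the classpath. BaseSpan and Span are hypothetical stand-ins that only mimic the Duration/BaseDuration layout (a subclass with no fields over a base class holding one long); they are not Joda classes, and the exact byte counts will differ from the real ones.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StreamOverheadDemo {

    // Hypothetical stand-in for BaseDuration: one long field.
    static class BaseSpan implements Serializable {
        private static final long serialVersionUID = 1L;
        final long millis;
        BaseSpan(long millis) { this.millis = millis; }
    }

    // Hypothetical stand-in for Duration: no fields of its own.
    static class Span extends BaseSpan {
        private static final long serialVersionUID = 1L;
        Span(long millis) { super(millis); }
    }

    // Writes all values into ONE ObjectOutputStream and returns the total size.
    static int serializedSize(Serializable... values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (Serializable value : values) {
                out.writeObject(value);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        int one = serializedSize(new Span(7_200_000L));
        int two = serializedSize(new Span(7_200_000L), new Span(3_600_000L));
        System.out.println("one object:  " + one + " bytes");
        System.out.println("two objects: " + two + " bytes");
        // The class descriptors are written only for the first object;
        // the second one refers back to them with a short handle.
        System.out.println("cost of the second object: " + (two - one) + " bytes");
    }
}
```

So if you serialize many values, putting them into a single stream (for example as elements of one collection or one enclosing object) spreads the class description cost over all of them.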

I've used the jdeserialize project to look at the result of serializing a java.util.Date vs. a Duration (note that this tool does not need access to the .class files, so all information it dumps is actually contained in the serialized data):

Result for java.util.Date:


read: java.util.Date _h0x7e0001 = r_0x7e0000;
//// BEGIN stream content output
java.util.Date _h0x7e0001 = r_0x7e0000;
//// END stream content output

//// BEGIN class declarations (excluding array classes)
class java.util.Date implements java.io.Serializable {
}

//// END class declarations

//// BEGIN instance dump
[instance 0x7e0001: 0x7e0000/java.util.Date
  object annotations:
    java.util.Date
        [blockdata 0x00: 8 bytes]

  field data:
    0x7e0000/java.util.Date:
]
//// END instance dump

Result for Duration:


read: org.joda.time.Duration _h0x7e0002 = r_0x7e0000;
//// BEGIN stream content output
org.joda.time.Duration _h0x7e0002 = r_0x7e0000;
//// END stream content output

//// BEGIN class declarations (excluding array classes)
class org.joda.time.Duration extends org.joda.time.base.BaseDuration implements java.io.Serializable {
}

class org.joda.time.base.BaseDuration implements java.io.Serializable {
    long iMillis;
}

//// END class declarations

//// BEGIN instance dump
[instance 0x7e0002: 0x7e0000/org.joda.time.Duration
  field data:
    0x7e0001/org.joda.time.base.BaseDuration:
        iMillis: 0
    0x7e0000/org.joda.time.Duration:
]
//// END instance dump

Note that the "class declaration" block is quite a bit longer for Duration. This also explains why serializing a single Duration takes 107 bytes, but serializing two (distinct) Duration objects takes only 121 bytes.
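As for a smaller representation: one option, if you control both ends, is to skip Java serialization for these values and write only the underlying millisecond count. The sketch below is one possible approach under that assumption, not something Joda Time provides for you; CompactMillisDemo, encodeMillis and decodeMillis are hypothetical names. With Joda Time you would get the value from Duration.getMillis() and rebuild it with new Duration(long). For a DateTime, getMillis() alone drops the time zone and chronology, so those would have to be stored separately if they matter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CompactMillisDemo {

    // Writes only the raw long: no stream header, no class description.
    static byte[] encodeMillis(long millis) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(millis);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Reads the long back from the bytes produced by encodeMillis.
    static long decodeMillis(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readLong();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        long twoHours = 2L * 60 * 60 * 1000; // e.g. Duration.getMillis() in Joda Time
        byte[] bytes = encodeMillis(twoHours);
        System.out.println("encoded size: " + bytes.length + " bytes");
        System.out.println("round trip ok: " + (decodeMillis(bytes) == twoHours));
    }
}
```

The trade-off is that the bytes no longer describe themselves: the reader has to know that the value is a duration in milliseconds.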
