Avoid duplicate Strings in Java

Question

I want to ask a question about avoiding String duplicates in Java.

The context is: an XML with tags and attributes like this one:

<product id="PROD" name="My Product"...></product>

With JiBX, this XML is marshalled to and unmarshalled from a class like this:

public class Product {
    private String id;
    private String name;
    // constructor, getters, setters, methods and so on
}

The program is a long-running batch process, so Product objects are created, used, copied, etc.

The question is: when I analysed a run with a tool like Eclipse Memory Analyzer (MAT), I found several duplicated Strings. For example, in the id attribute, the value PROD is duplicated across around 2000 instances, and so on.

How can I avoid this situation? Other attributes in the Product class may change their value during execution, but attributes like id, name... don't change so frequently.

I have read something about the String.intern() method, but I haven't used it yet and I'm not sure it's a solution for this. Could I define the most frequent values of those attributes as static final constants in the class?

I hope I have expressed my question clearly. Any help or advice is much appreciated. Thanks in advance.

Recommended answer

Interning would be the right solution, if you really have a problem. Java keeps String literals (and other compile-time constant Strings) in an internal pool. Strings created at runtime, for example by an XML parser, are not pooled automatically; calling intern() on such a String looks it up in the pool, adds it if it is missing, and returns a reference to the pooled String object.

There are two ways to control this behaviour:

String interned = aString.intern(); // returns a reference to the interned String
String notInterned = new String(aString); // creates a new String instance (guaranteed)
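
To make the difference between the two calls concrete, here is a minimal sketch that compares object identity with ==. The class name InternDemo and the helper sameObject are made up for this illustration; only String.intern() and the String copy constructor come from the JDK.

```java
class InternDemo {
    // Returns true when both references point to the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "PROD";            // literal: comes from the pool
        String copy = new String(literal);  // always a distinct object on the heap
        String interned = copy.intern();    // the pooled instance with equal content

        System.out.println(sameObject(literal, copy));     // false: two objects
        System.out.println(sameObject(literal, interned)); // true: one pooled object
        System.out.println(literal.equals(copy));          // true: same content
    }
}
```

The content is equal in all three cases; only the number of String objects kept alive differs, which is what MAT reports as duplicates.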

So maybe the libraries really do create new instances for all XML attribute values. This is possible, and you won't be able to change it.

intern() has a global effect: the pool is shared by the whole JVM, so an interned String is immediately available "for any object" (this view doesn't really make sense, but it may help to understand it).

So, let's say we have a line in class Foo, method foolish:

String s = "ABCD";

String literals are interned automatically. The JVM checks whether "ABCD" is already in the pool; if not, "ABCD" is stored in the pool. The JVM then assigns a reference to the interned String to s.

Now, maybe in another class Bar, in method barbar:

String t = "AB"+"CD";

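Here "AB" + "CD" is a compile-time constant expression, so the compiler folds it into the single literal "ABCD" before the class is ever loaded. At runtime the JVM finds "ABCD" in the pool already (hey, yes, it is there) and assigns the reference to the interned String "ABCD" to t. Note that this only holds for constant expressions: a concatenation that involves a non-constant variable creates a new String at runtime, and that String is not interned unless you call intern() on it.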
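
A small sketch of how literals and concatenation interact with the pool. ConcatDemo and joinAtRuntime are names invented for this example. It relies on the language rule that a concatenation of constants is evaluated by the compiler, while a concatenation of non-constant operands produces a newly created String at runtime.

```java
class ConcatDemo {
    static final String AB = "AB"; // a compile-time constant

    // The operands are parameters, so this is not a constant expression:
    // the result is built at runtime and is not taken from the pool.
    static String joinAtRuntime(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String s = "ABCD";
        String t = "AB" + "CD";               // folded to "ABCD" by the compiler
        String u = AB + "CD";                 // still a constant expression
        String v = joinAtRuntime("AB", "CD"); // new object, equal content

        System.out.println(s == t);           // true
        System.out.println(s == u);           // true
        System.out.println(s == v);           // false
        System.out.println(s == v.intern());  // true
    }
}
```

The last two lines are the relevant ones for the question: values produced at runtime, such as parsed attribute values, behave like v, not like t.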

Calling "PROD".intern() on its own may not help. Yes, the String "PROD" will be in the pool. But there's a chance that JiBX really creates new Strings for attribute values with something like

String value = new String(getAttributeValue(attribute));

In that case, value will not have a reference to an interned String (even if "PROD" is in the pool) but a reference to a new String instance on the heap.
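
If the library does hand over fresh String instances, one option is to intern the value at the point where it enters your own object. The following is only a sketch: it assumes the values reach Product through its setters (whether JiBX calls setters or writes fields directly depends on the binding, which is not shown in the question), and the null handling is an addition for this example.

```java
class Product {
    private String id;
    private String name;

    public void setId(String id) {
        // Swap whatever instance the parser supplied for the pooled one.
        this.id = (id == null) ? null : id.intern();
    }

    public void setName(String name) {
        this.name = (name == null) ? null : name.intern();
    }

    public String getId() { return id; }

    public String getName() { return name; }

    public static void main(String[] args) {
        Product a = new Product();
        Product b = new Product();
        // new String(...) simulates a parser that creates a fresh instance each time.
        a.setId(new String("PROD"));
        b.setId(new String("PROD"));
        System.out.println(a.getId() == b.getId()); // true: both share one String
    }
}
```

This is only worthwhile for attributes with few distinct values, such as id in the question; interning values that are mostly unique just fills the pool without saving memory.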

And, to the other question in your comment: this happens at runtime only. Compiling simply creates class files; the String pool is a data structure on the object heap that is used by the JVM that executes the application.
