Comparison of substring operation performance between .NET and Java

Question

Taking a substring of a string is a very common string manipulation operation, but I have heard that there may be considerable differences in performance and implementation between the Java and .NET platforms. Specifically, I heard that in Java, java.lang.String offers a constant-time substring operation, but in .NET, System.String offers only a linear-time Substring.

Is this really the case? Can it be confirmed in the documentation, source code, etc.? Is it implementation-specific, or specified by the language and/or platform? What are the pros and cons of each approach? What should someone migrating from one platform to the other look out for to avoid falling into performance pitfalls?

Recommended answer

In .NET, Substring is O(n) rather than the O(1) of Java. This is because in .NET, the String object contains all the actual character data itself [1] - so taking a substring involves copying all the data within the new substring. In Java, substring can just create a new object referring to the original char array, with a different starting index and length. (That describes the JDK implementations of the time; since Java 7 update 6, OpenJDK's String.substring copies the characters as well, so it is O(n) there too.)
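To make the two strategies concrete, here is a minimal sketch in Java. The classes `SharedText` and `CopiedText` are hypothetical names invented for illustration; neither is the real `java.lang.String` or `System.String`, and real implementations carry extra state (hash caches, encodings) that is left out here.

```java
import java.util.Arrays;

/** Hypothetical text type whose substrings share the parent's array (O(1) substring). */
final class SharedText {
    private final char[] chars; // may be shared with other SharedText instances
    private final int offset;   // index of the first visible character in chars
    private final int length;   // number of visible characters

    SharedText(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedText(char[] chars, int offset, int length) {
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    /** Constant time: only a small wrapper object is allocated, no characters are copied. */
    SharedText substring(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("begin=" + begin + ", end=" + end);
        }
        return new SharedText(chars, offset + begin, end - begin);
    }

    boolean sharesStorageWith(SharedText other) {
        return chars == other.chars;
    }

    @Override
    public String toString() {
        return new String(chars, offset, length);
    }
}

/** Hypothetical text type that owns exactly its own characters (O(n) substring). */
final class CopiedText {
    private final char[] chars; // never shared with another instance

    CopiedText(String s) {
        this(s.toCharArray());
    }

    private CopiedText(char[] chars) {
        this.chars = chars;
    }

    /** Linear in the substring length: the characters are copied into a fresh array. */
    CopiedText substring(int begin, int end) {
        if (begin < 0 || end > chars.length || begin > end) {
            throw new IndexOutOfBoundsException("begin=" + begin + ", end=" + end);
        }
        return new CopiedText(Arrays.copyOfRange(chars, begin, end));
    }

    int storageLength() {
        return chars.length;
    }

    @Override
    public String toString() {
        return new String(chars);
    }
}

class SubstringStrategiesDemo {
    public static void main(String[] args) {
        SharedText sharedSub = new SharedText("hello world").substring(6, 11);
        CopiedText copiedSub = new CopiedText("hello world").substring(6, 11);
        System.out.println(sharedSub + " / " + copiedSub);
    }
}
```

Both versions produce the same text; the difference is only in what each substring allocates and what it keeps reachable.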

There are pros and cons to each approach:

  • .NET's approach has better cache locality, creates fewer objects [2], and avoids the situation where one small substring prevents a very large char[] from being garbage collected. I believe in some cases it can make interop very easy too, internally.
  • Java's approach makes taking a substring very efficient, and probably some other operations too.
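The garbage-collection point in the first bullet can be sketched as follows. `CharSlice` is a hypothetical array-sharing type written for this example, and `compact()` is an assumed helper that mirrors the usual workaround of copying the small piece so the large array is no longer referenced.

```java
import java.util.Arrays;

/** Hypothetical slice over a shared char[]; it does not copy the array it is given. */
final class CharSlice {
    private final char[] chars;
    private final int offset;
    private final int length;

    CharSlice(char[] chars) {
        this(chars, 0, chars.length);
    }

    private CharSlice(char[] chars, int offset, int length) {
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    /** O(1) slice that keeps a reference to the whole parent array. */
    CharSlice slice(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("begin=" + begin + ", end=" + end);
        }
        return new CharSlice(chars, offset + begin, end - begin);
    }

    /** O(n) copy of just the visible characters, so the parent array can be collected. */
    CharSlice compact() {
        return new CharSlice(Arrays.copyOfRange(chars, offset, offset + length));
    }

    /** Number of chars kept reachable by this slice, visible or not. */
    int retainedChars() {
        return chars.length;
    }

    int length() {
        return length;
    }
}

class RetentionDemo {
    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');

        CharSlice small = new CharSlice(big).slice(0, 3);
        System.out.println(small.length() + " visible, " + small.retainedChars() + " retained");

        CharSlice compacted = small.compact();
        System.out.println(compacted.length() + " visible, " + compacted.retainedChars() + " retained");
    }
}
```

A three-character slice of a million-character array keeps the whole million reachable until it is compacted, which is exactly the cost the copying approach never pays.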

There's a little more detail in my strings article.

As for the general question of avoiding performance pitfalls, I think I should have a canned answer ready to cut and paste: make sure your architecture is efficient, and implement it in the most readable way you can. Measure the performance, and optimise where you find bottlenecks.
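As a rough illustration of the "measure" step, the sketch below times `String.substring` for growing substring lengths on whatever JVM runs it. The class name `SubstringTiming` and its helper are made up for this example, and a hand-rolled `System.nanoTime` loop is only a crude indicator; a proper benchmark harness would give more trustworthy numbers.

```java
import java.util.Arrays;

class SubstringTiming {
    /**
     * Takes substring(0, subLength) repeatedly (subLength must be at least 1).
     * Returns {elapsed nanoseconds, checksum}; the checksum uses each result
     * so the loop body is less likely to be optimised away.
     */
    static long[] timeSubstring(String source, int subLength, int iterations) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            String sub = source.substring(0, subLength);
            checksum += sub.charAt(subLength - 1);
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { elapsed, checksum };
    }

    public static void main(String[] args) {
        char[] data = new char[1 << 20];
        Arrays.fill(data, 'a');
        String source = new String(data);
        int iterations = 500;

        timeSubstring(source, 1 << 10, 20_000); // warm-up so the JIT compiles the loop

        // If the time per call grows with the length, substring is copying.
        for (int subLength = 1 << 10; subLength <= 1 << 20; subLength <<= 2) {
            long[] result = timeSubstring(source, subLength, iterations);
            System.out.printf("substring length %8d: %12.1f ns per call%n",
                    subLength, result[0] / (double) iterations);
        }
    }
}
```

If the per-call time stays flat as the length grows, the runtime is sharing storage; if it scales with the length, it is copying.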

[1] Incidentally, this makes string very special - it's the only non-array type whose memory footprint varies by instance within the same CLR.

[2] For small strings, this is a big win. It's bad enough that there's all the overhead of one object, but when there's an extra array involved as well, a single-character string could take around 36 bytes in Java. (That's a "finger-in-the-air" number - I can't remember the exact object overheads. It will also depend on the VM you're using.)
