Java's Scanner vs String.split() vs StringTokenizer; which should I use?
Question
I am currently using split() to scan through a file where each line has a number of strings delimited by '~'. I read somewhere that Scanner could do a better job with a long file, performance-wise, so I thought about checking it out.
My question is: would I have to create two instances of Scanner? That is, one to read a line and another, built on that line, to get the tokens between delimiters? If I have to do that, I doubt I would gain any advantage from using it. Maybe I am missing something here?
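To make the "two instances" question concrete, here is a minimal sketch of two ways to read such a file. It is an illustration under stated assumptions, not the asker's actual code: the class name TildeFileReading and both method names are hypothetical, and the input is taken as any Reader. The first method uses a single Scanner whose delimiter matches either '~' or a line break, so no second Scanner is needed, but it no longer tells you where one line ends and the next begins. The second keeps line boundaries by pairing BufferedReader.readLine() with a precompiled Pattern.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class TildeFileReading {
    // Patterns are compiled once and reused for every line.
    private static final Pattern TILDE = Pattern.compile("~");
    // "\\R" matches any line break (Java 8+), so '~' and newlines both end a token.
    private static final Pattern TILDE_OR_NEWLINE = Pattern.compile("~|\\R");

    // Option A: a single Scanner over the whole input.
    // All tokens come back in one flat list; line boundaries are lost.
    static List<String> allTokens(Reader source) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter(TILDE_OR_NEWLINE);
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    // Option B: read line by line and split each line, keeping one array per line.
    static List<String[]> tokensPerLine(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(TILDE.split(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

Either method can be called with, for example, a java.io.FileReader or a java.io.StringReader. Which one is preferable depends on whether the rest of the program needs to know which line a token came from.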
Recommended answer
I ran some timing measurements on these in a single-threaded setup, and here are the results I got.
Time metrics

Tokenizer | String.split() | while+substring | Scanner | ScannerWithCompiledPattern
4.0 ms    | 5.1 ms         | 1.2 ms          | 0.5 ms  | 0.1 ms
4.4 ms    | 4.8 ms         | 1.1 ms          | 0.1 ms  | 0.1 ms
3.5 ms    | 4.7 ms         | 1.2 ms          | 0.1 ms  | 0.1 ms
3.5 ms    | 4.7 ms         | 1.1 ms          | 0.1 ms  | 0.1 ms
3.5 ms    | 4.7 ms         | 1.1 ms          | 0.1 ms  | 0.1 ms
The outcome is that, in these runs, Scanner gave the best performance. The same comparison still needs to be done in a multithreaded mode. One of my seniors says that the Tokenizer causes a CPU spike and String.split does not.
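The answer does not include the code behind these numbers, so the following is only a sketch of what the compared approaches might look like for a single '~'-delimited line, not the original benchmark. The class name TildeTokenizers and all method names are hypothetical. The variants are not exact equivalents: StringTokenizer skips empty fields, and String.split drops trailing empty strings. The elapsedMillis helper is a rough System.nanoTime() wrapper; timings taken this way are sensitive to JVM warm-up, so a harness such as JMH would be a better fit for a serious comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class TildeTokenizers {
    // Compiled once so the Scanner variant does not recompile the delimiter.
    private static final Pattern TILDE = Pattern.compile("~");

    // StringTokenizer: skips empty fields, so "a~~b" yields two tokens.
    static List<String> withTokenizer(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, "~");
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    // String.split: keeps inner empty fields but drops trailing empty strings.
    static List<String> withSplit(String line) {
        return new ArrayList<>(Arrays.asList(line.split("~")));
    }

    // while + substring: a manual indexOf loop that keeps every field.
    static List<String> withSubstring(String line) {
        List<String> out = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = line.indexOf('~', start)) >= 0) {
            out.add(line.substring(start, end));
            start = end + 1;
        }
        out.add(line.substring(start));
        return out;
    }

    // Scanner with a precompiled delimiter pattern.
    static List<String> withScanner(String line) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(line)) {
            scanner.useDelimiter(TILDE);
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    // Rough wall-clock timing of one task, in milliseconds.
    static double elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }
}
```

A call such as TildeTokenizers.elapsedMillis(() -> TildeTokenizers.withSplit(line)) times one variant. Repeating each measurement many times and discarding the first runs gives a fairer picture than a single call.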