Java's Scanner vs String.split() vs StringTokenizer; which should I use?
Problem description
I am currently using split() to scan through a file where each line has a number of strings delimited by '~'. I read somewhere that Scanner could do a better job with a long file, performance-wise, so I thought about checking it out.
My question is: would I have to create two instances of Scanner? That is, one to read a line and another, built from that line, to get the tokens for a delimiter? If I have to do so, I doubt I would get any advantage from using it. Maybe I am missing something here?
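To make the two-instance question concrete, here is a minimal sketch of both shapes. The class name, the method names, and the use of an in-memory string instead of a file are assumptions for illustration only. The first method uses one Scanner for lines and a short-lived second Scanner per line. The second method uses a single Scanner whose delimiter matches both '~' and line breaks, which avoids the second instance but loses the line boundaries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class TwoScannerSketch {
    // Compiled once and reused, rather than recompiled for every line.
    private static final Pattern TILDE = Pattern.compile("~");
    // Matches either a '~' or a line break (hypothetical choice for this sketch).
    private static final Pattern TILDE_OR_NEWLINE = Pattern.compile("~|\r?\n");

    // Outer Scanner walks the lines; an inner Scanner splits each line on '~'.
    static List<List<String>> tokensPerLine(String input) {
        List<List<String>> rows = new ArrayList<>();
        try (Scanner lines = new Scanner(input)) {
            while (lines.hasNextLine()) {
                List<String> row = new ArrayList<>();
                try (Scanner fields = new Scanner(lines.nextLine()).useDelimiter(TILDE)) {
                    while (fields.hasNext()) {
                        row.add(fields.next());
                    }
                }
                rows.add(row);
            }
        }
        return rows;
    }

    // Single Scanner: '~' and line breaks are both delimiters,
    // so the result is one flat list with no per-line grouping.
    static List<String> allTokens(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter(TILDE_OR_NEWLINE)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "a~b\nc~d~e";
        System.out.println(tokensPerLine(sample));
        System.out.println(allTokens(sample));
    }
}
```

For a real file, the outer Scanner would be constructed from a File or a Reader instead of a String. Whether either shape is faster than split() on a given input is something only a measurement can show.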
I took some measurements of these approaches in a single-threaded model, and here are the results I got.
Time Metrics

Tokenizer | String.Split() | while+SubString | Scanner | ScannerWithCompiledPattern
4.0 ms    | 5.1 ms         | 1.2 ms          | 0.5 ms  | 0.1 ms
4.4 ms    | 4.8 ms         | 1.1 ms          | 0.1 ms  | 0.1 ms
3.5 ms    | 4.7 ms         | 1.2 ms          | 0.1 ms  | 0.1 ms
3.5 ms    | 4.7 ms         | 1.1 ms          | 0.1 ms  | 0.1 ms
3.5 ms    | 4.7 ms         | 1.1 ms          | 0.1 ms  | 0.1 ms
The outcome is that Scanner gives the best performance. The same now needs to be evaluated in a multithreaded mode. One of my seniors says that the Tokenizer causes a CPU spike and String.split does not.
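The benchmark code behind these numbers is not shown, so the following is only a sketch of what four of the compared approaches might look like for a single '~'-delimited line. The class and method names are hypothetical, and no timing is included. Note that the approaches differ on edge cases: StringTokenizer skips empty tokens, while split() keeps empty tokens in the middle of a line and drops trailing ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

public class SplitApproaches {
    // Precompiled delimiter for the Scanner variant.
    private static final Pattern TILDE = Pattern.compile("~");

    // StringTokenizer with '~' as the only delimiter character.
    static List<String> withTokenizer(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, "~");
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    // String.split() with '~' as the regex.
    static List<String> withSplit(String line) {
        return new ArrayList<>(Arrays.asList(line.split("~")));
    }

    // Manual loop: find each '~' with indexOf and cut with substring.
    static List<String> withSubstring(String line) {
        List<String> out = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = line.indexOf('~', start)) >= 0) {
            out.add(line.substring(start, end));
            start = end + 1;
        }
        out.add(line.substring(start)); // remainder after the last '~'
        return out;
    }

    // Scanner over the line, using the precompiled pattern as delimiter.
    static List<String> withScanner(String line) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(line).useDelimiter(TILDE)) {
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "a~b~c";
        System.out.println(withTokenizer(line));
        System.out.println(withSplit(line));
        System.out.println(withSubstring(line));
        System.out.println(withScanner(line));
    }
}
```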