Java reading a file different methods
Problem description
It seems that there are many, many ways to read text files in Java (BufferedReader, DataInputStream, etc.). My personal favorite is Scanner with a File in the constructor (it's just simpler, works better with mathy data processing, and has familiar syntax). Boris the Spider also mentioned Channel and RandomAccessFile.
Can someone explain the pros and cons of each of these methods? To be specific, when would I want to use each?
(edit) I think I should be specific and add that I have a strong preference for the Scanner method. So the real question is, when wouldn't I want to use it?
Solution
Let's start at the beginning. The question is: what do you want to do?
It's important to understand what a file actually is. A file is a collection of bytes on a disk; these bytes are your data. Java provides various levels of abstraction above that:
- File(Input|Output)Stream - read these bytes as a stream of byte.
- File(Reader|Writer) - read from a stream of bytes as a stream of char.
- Scanner - read from a stream of char and tokenise it.
- RandomAccessFile - read these bytes as a seekable byte[].
- FileChannel - read these bytes in a safe multithreaded way.
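To make the levels above a little more concrete, here is a minimal sketch that reads the same kind of file as raw bytes, as decoded characters, as tokens, and by seeking. The class name AbstractionLevels and its method names are hypothetical, chosen only for illustration, and UTF-8 is an assumed encoding.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Scanner;

// Hypothetical helper class; the names are illustrative only.
public class AbstractionLevels {

    // Byte level: returns the first raw byte (0-255), or -1 if the file is empty.
    static int firstByte(Path path) throws IOException {
        try (InputStream in = new FileInputStream(path.toFile())) {
            return in.read();
        }
    }

    // Char level: decodes the bytes as UTF-8 (an assumption) and returns
    // the first char, or -1 if the file is empty.
    static int firstChar(Path path) throws IOException {
        try (Reader reader = new InputStreamReader(
                new FileInputStream(path.toFile()), StandardCharsets.UTF_8)) {
            return reader.read();
        }
    }

    // Token level: Scanner splits the chars on whitespace and parses ints.
    // Summing stops at the first token that is not an int.
    static int sumOfInts(Path path) throws IOException {
        int sum = 0;
        try (Scanner scanner = new Scanner(path, "UTF-8")) {
            while (scanner.hasNextInt()) {
                sum += scanner.nextInt();
            }
        }
        return sum;
    }

    // Random access: jump straight to a byte offset and read one byte there.
    static int byteAt(Path path, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "r")) {
            raf.seek(offset);
            return raf.read();
        }
    }
}
```

For a file containing a single non-ASCII character, firstByte and firstChar return different values, which is the whole difference between the byte and char levels.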
On top of each of those there are the Decorators. For example, you can add buffering with BufferedXXX. You could add linebreak awareness to a FileWriter with PrintWriter. You could turn an InputStream into a Reader with an InputStreamReader (for a long time the only way to specify the character encoding for a Reader; since Java 11, FileReader also has constructors that accept a Charset).
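As a rough sketch of how these decorators stack, the following wraps a FileOutputStream in an OutputStreamWriter and a PrintWriter for writing, and a FileInputStream in an InputStreamReader and a BufferedReader for reading. DecoratorChain, writeLines and readLines are hypothetical names, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
public class DecoratorChain {

    // bytes -> chars (OutputStreamWriter, explicit encoding) -> lines (PrintWriter)
    static void writeLines(Path path, List<String> lines) throws IOException {
        try (PrintWriter out = new PrintWriter(new OutputStreamWriter(
                new FileOutputStream(path.toFile()), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.println(line); // appends the platform line separator
            }
        }
    }

    // bytes -> chars (InputStreamReader, explicit encoding) -> buffered lines
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new FileInputStream(path.toFile()), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Each wrapper adds exactly one concern (encoding, buffering, line handling), so you only pay for the layers you actually need.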
So - when wouldn't I want to use it [a Scanner]?
You would not use a Scanner if you wanted to (these are some examples):
- Read in data as bytes
- Read in a serialized Java object
- Copy bytes from one file to another, maybe with some filtering.
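The last case in that list could look something like the sketch below, which copies a file byte by byte and drops carriage-return bytes along the way. The filter is only an assumed example of "some filtering", and ByteCopy and copyWithoutCarriageReturns are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative only.
public class ByteCopy {

    // Copies source to target, skipping every '\r' byte.
    // Returns the number of bytes written.
    static long copyWithoutCarriageReturns(Path source, Path target) throws IOException {
        long written = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(source.toFile()));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target.toFile()))) {
            int b;
            while ((b = in.read()) != -1) {
                if (b != '\r') {
                    out.write(b);
                    written++;
                }
            }
        }
        return written;
    }
}
```

No decoding happens anywhere here, so the copy is safe for binary data, which is exactly where a Scanner would be the wrong tool.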
It is also worth noting that the Scanner(File file) constructor takes the File and opens a FileInputStream with the platform default encoding - this is almost always a bad idea. It is generally recognised that you should specify the encoding explicitly to avoid nasty encoding-based bugs. Furthermore, the stream isn't buffered.
So you may be better off with
try (final Scanner scanner = new Scanner(new BufferedInputStream(new FileInputStream(file)), "UTF-8")) {
    // "file" is the File you would otherwise have passed straight to Scanner
    // do stuff
}
Ugly, I know.
It's worth noting that Java 7 provides a further layer of abstraction that removes the need to loop over files - these methods are in the Files class:
byte[] Files.readAllBytes(Path path)
List<String> Files.readAllLines(Path path, Charset cs)
Both these methods read the entire file into memory, which might not be appropriate. In Java 8 this is further improved by adding support for the new Stream API:
Stream<String> Files.lines(Path path, Charset cs)
Stream<Path> Files.list(Path dir)
For example, to get a Stream of words from a Path you can do:
final Stream<String> words = Files.lines(Paths.get("myFile.txt")).
flatMap((in) -> Arrays.stream(in.split("\\b")));
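A slightly fuller sketch of the Files approach follows. It differs from the snippet above in two ways: it splits on "\\W+" rather than "\\b", because splitting on word boundaries also returns the whitespace and punctuation between words, and it closes the stream with try-with-resources, since Files.lines keeps the file open until the stream is closed. FilesExamples, allLines and words are hypothetical names, and UTF-8 is an assumed encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; the names are illustrative only.
public class FilesExamples {

    // Java 7 style: the whole file is loaded into memory as a list of lines.
    static List<String> allLines(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    // Java 8 style: lines are read lazily, split into words and collected.
    // "\\W+" treats any run of non-word characters as a separator; by
    // default that character class only recognises ASCII word characters.
    static List<String> words(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.split("\\W+")))
                    .filter(word -> !word.isEmpty()) // a leading separator yields ""
                    .collect(Collectors.toList());
        }
    }
}
```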