How do I create a Java string from the contents of a file?


Question


I've been using the idiom below for some time now, and it seems to be the most widespread, at least on the sites I've visited.

Is there a better/different way to read a file into a string in Java?

private String readFile(String file) throws IOException {
    BufferedReader reader = new BufferedReader(new FileReader (file));
    String         line = null;
    StringBuilder  stringBuilder = new StringBuilder();
    String         ls = System.getProperty("line.separator");

    try {
        while((line = reader.readLine()) != null) {
            stringBuilder.append(line);
            stringBuilder.append(ls);
        }

        return stringBuilder.toString();
    } finally {
        reader.close();
    }
}

Solution

Read all text from a file

Here's a compact, robust idiom for Java 7, wrapped up in a utility method:

static String readFile(String path, Charset encoding) 
  throws IOException 
{
  byte[] encoded = Files.readAllBytes(Paths.get(path));
  return new String(encoded, encoding);
}
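
For anyone who wants to try the idiom as-is, here is a minimal, self-contained sketch with the imports it needs. The class name ReadFileDemo and the temporary sample file are assumptions made for illustration; only readFile itself comes from the answer.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical wrapper class, used only to make the snippet compile on its own.
final class ReadFileDemo {

    // Same idiom as the utility method above: read every byte, then decode once.
    static String readFile(String path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return new String(encoded, encoding);
    }

    public static void main(String[] args) throws IOException {
        // Sample file created only for this demonstration.
        Path tmp = Files.createTempFile("readfile-demo", ".txt");
        try {
            Files.write(tmp, "first line\r\nsecond line\n".getBytes(StandardCharsets.UTF_8));
            String content = readFile(tmp.toString(), StandardCharsets.UTF_8);
            // The line separators come back exactly as stored, "\r\n" included.
            System.out.println(content.equals("first line\r\nsecond line\n"));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because the bytes are decoded in one step, the result is a faithful copy of the file's text, which is what distinguishes this approach from the line-based ones.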

Read lines of text from a file

Java 7 added a convenience method to read a file as lines of text, represented as a List<String>. This approach is "lossy" because the line separators are stripped from the end of each line.

List<String> lines = Files.readAllLines(Paths.get(path), encoding);
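
To make the "lossy" point concrete, the sketch below reads a file whose lines end in "\r\n" and joins the lines back together with "\n". The class name LossyLinesDemo and the helper rejoin are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical demo class; not part of the original answer.
final class LossyLinesDemo {

    // Reads all lines and joins them with "\n". The original separators
    // (and any trailing line break) are gone by the time we join.
    static String rejoin(Path path, Charset encoding) throws IOException {
        List<String> lines = Files.readAllLines(path, encoding);
        return String.join("\n", lines);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lossy-demo", ".txt");
        try {
            String original = "a\r\nb\r\n";
            Files.write(tmp, original.getBytes(StandardCharsets.UTF_8));
            String rejoined = rejoin(tmp, StandardCharsets.UTF_8);
            // The rejoined text differs from the file contents.
            System.out.println(rejoined.equals(original));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

If the exact separators matter, for example when the text is written back out or hashed, reading the bytes and decoding them once is the safer choice.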

In Java 8, BufferedReader added a new method, lines(), to produce a Stream<String>. If an IOException is encountered while reading the file, it is wrapped in an UncheckedIOException, since Stream doesn't accept lambdas that throw checked exceptions.

try (BufferedReader r = Files.newBufferedReader(Paths.get(path), encoding)) {
  r.lines().forEach(System.out::println);
}
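
One possible way to deal with the wrapped exception is to catch UncheckedIOException at the boundary and rethrow its cause, so callers still see a plain IOException. This is a sketch under that assumption; the class StreamLinesDemo and the method countNonBlank are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original answer.
final class StreamLinesDemo {

    // Counts non-blank lines lazily: only one line at a time needs to be
    // held in memory while the stream is consumed.
    static long countNonBlank(Path path, Charset encoding) throws IOException {
        try (BufferedReader r = Files.newBufferedReader(path, encoding)) {
            return r.lines().filter(line -> !line.trim().isEmpty()).count();
        } catch (UncheckedIOException e) {
            // Read failures inside the stream arrive wrapped; unwrap them here.
            throw e.getCause();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("stream-demo", ".txt");
        try {
            Files.write(tmp, "a\n\n  \nb\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countNonBlank(tmp, StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The stream is consumed inside the try-with-resources block, so the reader is still open while the lines are being pulled through.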

Memory utilization

The first method, which preserves line breaks, can temporarily require memory several times the size of the file, because for a short time the raw file contents (a byte array) and the decoded characters (each of which is 16 bits even if encoded as 8 bits in the file) reside in memory at once. It is safest to apply to files that you know to be small relative to the available memory.

The second method, reading lines, is usually more memory efficient, because the input byte buffer for decoding doesn't need to contain the entire file. However, it's still not suitable for files that are very large relative to available memory.

For reading large files, you need a different design for your program, one that reads a chunk of text from a stream, processes it, and then moves on to the next, reusing the same fixed-size memory block. Here, "large" depends on the computer specs. Nowadays, this threshold might be many gigabytes of RAM. The third method, using a Stream<String>, is one way to do this, if your input "records" happen to be individual lines. (Using the readLine() method of BufferedReader is the procedural equivalent to this approach.)
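
When the records are not lines, the same chunk-at-a-time design can be sketched with a Reader and a reused char array. The example below is only an illustration: the class ChunkedReadDemo, the method countChar, and the 8192-character buffer size are assumptions, and counting a character stands in for whatever per-chunk processing a real program needs.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original answer.
final class ChunkedReadDemo {

    // Counts occurrences of target while holding at most one buffer of
    // decoded text in memory, regardless of the file's size.
    static long countChar(Path path, Charset encoding, char target) throws IOException {
        char[] buffer = new char[8192]; // reused for every chunk
        long count = 0;
        try (Reader reader = Files.newBufferedReader(path, encoding)) {
            int n;
            // read() returns the number of chars placed in the buffer, or -1 at end of stream.
            while ((n = reader.read(buffer)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buffer[i] == target) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunk-demo", ".txt");
        try {
            Files.write(tmp, "banana".getBytes(StandardCharsets.UTF_8));
            System.out.println(countChar(tmp, StandardCharsets.UTF_8, 'a'));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Processing that spans chunk boundaries, such as searching for a multi-character token, needs extra state carried from one chunk to the next; this sketch does not attempt that.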

Character encoding

One thing that is missing from the sample in the original post is the character encoding. There are some special cases where the platform default is what you want, but they are rare, and you should be able to justify your choice.

The StandardCharsets class defines some constants for the encodings required of all Java runtimes:

String content = readFile("test.txt", StandardCharsets.UTF_8);

The platform default is available from the Charset class itself:

String content = readFile("test.txt", Charset.defaultCharset());
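
To see why the choice matters, the sketch below decodes the same file with two different charsets. The class EncodingDemo, the method decodeFile, and the sample text are assumptions for illustration; the decoding step is the same one readFile performs.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original answer.
final class EncodingDemo {

    // Reads all bytes and decodes them with the given charset.
    static String decodeFile(Path path, Charset encoding) throws IOException {
        return new String(Files.readAllBytes(path), encoding);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("encoding-demo", ".txt");
        try {
            // "caf" followed by U+00E9, which UTF-8 stores as two bytes.
            Files.write(tmp, "caf\u00e9".getBytes(StandardCharsets.UTF_8));
            String right = decodeFile(tmp, StandardCharsets.UTF_8);
            String wrong = decodeFile(tmp, StandardCharsets.ISO_8859_1);
            // The correct charset yields 4 characters; the wrong one turns
            // each of the two trailing bytes into its own character.
            System.out.println(right.length() + " vs " + wrong.length());
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

No exception is thrown in the mismatched case, which is why an unexamined default can silently corrupt text.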


Note: This answer largely replaces my Java 6 version. The Java 7 utility safely simplifies the code, and the old answer, which used a mapped byte buffer, prevented the file that was read from being deleted until the mapped buffer was garbage collected. You can view the old version via the "edited" link on this answer.
