Java text file size (before file is closed)
Problem description
I am collecting full HTML from a service that provides access to a very large collection of blogs and news websites. I am checking the HTML as it comes (in real-time) to see if it contains some keywords. If it contains one of the keywords, I am writing the HTML to a text file to store it.
I want to do this for a week. Therefore I am collecting a large amount of data. Testing the program for 3 minutes yielded a text file of 100MB. I have 4 TB of space, and I can't use more than this.
Also, I don't want the text files to become too large, because I assume they'll become un-openable.
What I am proposing is to open a text file, and write HTML to it, frequently checking its size. If it becomes bigger than, let's say 200MB, I close the text file and open another. I also need to keep a running log of how much space I've used in total, so that I can make sure that I don't get close to 4 TB.
The question I have at this point is how to check the size of the text file before the file has been closed (using FileWriter.close()). Is there a function for this or should I count the number of characters written to the file and use that to estimate the file size?
A separate question: are there ways of minimising the amount of space my text files take up? I am working in Java.
Create a writer that counts the number of characters written, and use it to wrap your OutputStreamWriter.
[EDIT] Note: The correct way to save text to a file is:
new BufferedWriter( new OutputStreamWriter( new FileOutputStream( file ), encoding ) );
The encoding is important; it's usually "UTF-8".
This chain gives you two places where you can inject your wrapper: you can wrap the writer to get the number of characters written, or wrap the inner OutputStream to get the number of bytes written.
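As a rough illustration of the second option, here is a minimal sketch of a byte-counting wrapper around the inner OutputStream. The class names `CountingOutputStream` and `CountingDemo`, the 200 MB threshold constant, and the temp-file name are assumptions made for this example, not part of the original answer. Because the wrapper sits below the encoder, it sees bytes rather than characters, and its count only reflects text that has already been flushed out of the BufferedWriter.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: counts every byte handed to the wrapped stream. */
class CountingOutputStream extends FilterOutputStream {
    private long count = 0;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forward the whole range in one call; the inherited version
        // would call write(int) once per byte.
        out.write(b, off, len);
        count += len;
    }

    /** Bytes passed to the underlying stream so far. */
    long getCount() {
        return count;
    }
}

class CountingDemo {
    // Assumed rollover threshold, taken from the 200 MB figure in the question.
    static final long MAX_BYTES = 200L * 1024 * 1024;

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("capture", ".txt");
        CountingOutputStream counter =
                new CountingOutputStream(new FileOutputStream(file));

        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(counter, StandardCharsets.UTF_8))) {
            writer.write("<html>caf\u00e9</html>");
            // Flush so buffered characters are encoded and reach the counter.
            writer.flush();

            System.out.println("bytes written so far: " + counter.getCount());
            if (counter.getCount() > MAX_BYTES) {
                // This is where the current file would be closed
                // and the next one opened.
                System.out.println("threshold reached");
            }
        }
        file.delete();
    }
}
```

Adding each file's final count to a running total when it is closed would give the overall figure to compare against the 4 TB limit. Wrapping the Writer instead would count characters, which only approximates the file size for encodings such as UTF-8 where one character may take several bytes.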