Java text file size (before file is closed)

Question

I am collecting full HTML from a service that provides access to a very large collection of blogs and news websites. I am checking the HTML as it comes (in real-time) to see if it contains some keywords. If it contains one of the keywords, I am writing the HTML to a text file to store it.

I want to do this for a week. Therefore I am collecting a large amount of data. Testing the program for 3 minutes yielded a text file of 100MB. I have 4 TB of space, and I can't use more than this.

Also, I don't want the text files to become too large, because I assume they'll become un-openable.

What I am proposing is to open a text file, and write HTML to it, frequently checking its size. If it becomes bigger than, let's say 200MB, I close the text file and open another. I also need to keep a running log of how much space I've used in total, so that I can make sure that I don't get close to 4 TB.

The question I have at this point is how to check the size of the text file before the file has been closed (using FileWriter.close()). Is there a function for this or should I count the number of characters written to the file and use that to estimate the file size?

A separate question: are there ways of minimising the amount of space my text files take up? I am working in Java.

Solution

Create a writer which counts the number of characters written and use that to wrap your OutputStreamWriter.
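A minimal sketch of such a counting writer, assuming it is built on `java.io.FilterWriter`. The class name `CountingWriter` and the accessor `getCharCount()` are made up for illustration and are not part of the JDK. Note that a character count only estimates the file size, since the number of bytes per character depends on the encoding.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.Writer;

/** Hypothetical helper: counts the characters passed on to the wrapped writer. */
class CountingWriter extends FilterWriter {
    private long charCount;

    CountingWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(int c) throws IOException {
        out.write(c);
        charCount++;
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        out.write(cbuf, off, len);
        charCount += len;
    }

    @Override
    public void write(String str, int off, int len) throws IOException {
        out.write(str, off, len);
        charCount += len;
    }

    /** Number of characters written so far; valid while the writer is still open. */
    long getCharCount() {
        return charCount;
    }
}
```

The other `write` and `append` overloads inherited from `Writer` funnel into these three methods, so each character is counted once.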

[EDIT] Note: The correct way to save text to a file is:

new BufferedWriter( new OutputStreamWriter( new FileOutputStream( file ), encoding ) );

The encoding is important; it's usually "UTF-8".

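This chain gives you two places to inject your wrapper: wrap the writer to count characters, or wrap the inner OutputStream to count the bytes actually written. With UTF-8 the two can differ, because a character may encode to more than one byte, so the byte count is the one that matches the size of the file.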
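As a rough sketch of the byte-counting variant, the code below wraps the inner stream and uses the count to roll over to a new file, as the question proposes. `CountingOutputStream`, `RollingTextStore`, the `pages-N.txt` file names and both limits are hypothetical choices for illustration; only the `java.io` and `java.nio.file` calls are real JDK API.

```java
import java.io.BufferedWriter;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: counts every byte passed on to the wrapped stream. */
class CountingOutputStream extends FilterOutputStream {
    private long byteCount;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        byteCount++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len); // delegate in bulk instead of byte by byte
        byteCount += len;
    }

    long getByteCount() {
        return byteCount;
    }
}

/** Hypothetical helper: starts a new file once the current one reaches a size limit. */
class RollingTextStore implements AutoCloseable {
    private final Path dir;
    private final long maxFileBytes;
    private final long maxTotalBytes;
    private long closedFilesBytes; // bytes in files that are already closed
    private int fileIndex;
    private CountingOutputStream counter;
    private Writer writer;

    RollingTextStore(Path dir, long maxFileBytes, long maxTotalBytes) throws IOException {
        this.dir = dir;
        this.maxFileBytes = maxFileBytes;
        this.maxTotalBytes = maxTotalBytes;
        openNext();
    }

    private void openNext() throws IOException {
        Path file = dir.resolve("pages-" + (fileIndex++) + ".txt");
        counter = new CountingOutputStream(Files.newOutputStream(file));
        writer = new BufferedWriter(new OutputStreamWriter(counter, StandardCharsets.UTF_8));
    }

    /** Total bytes written across all files, including the one still open. */
    long totalBytes() {
        return closedFilesBytes + counter.getByteCount();
    }

    void store(String html) throws IOException {
        // The budget is checked before writing, so it can be overshot by one document.
        if (totalBytes() >= maxTotalBytes) {
            throw new IOException("space budget exhausted");
        }
        writer.write(html);
        writer.write('\n');
        writer.flush(); // push buffered characters through so the byte count is current
        if (counter.getByteCount() >= maxFileBytes) {
            writer.close();
            closedFilesBytes += counter.getByteCount();
            openNext();
        }
    }

    @Override
    public void close() throws IOException {
        writer.close();
    }
}
```

For the figures in the question, this could be constructed with a per-file limit of 200L * 1024 * 1024 bytes and a total limit somewhat below 4 TB. Flushing after every document keeps the count exact at the cost of some buffering efficiency; flushing less often would make the count lag behind by whatever is still buffered.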
