Reliable method of cleaning up an external resource associated with an Object

Problem description

Concrete use case: There is an abstraction for binary data, which is widely used to handle binary blobs of arbitrary size. Since the abstraction was created without thought for anything outside the VM, existing implementations rely on the garbage collector for their life cycle.

Now I want to add a new implementation that uses off-heap storage (e.g. in a temporary file). Since there is a lot of existing code that uses the abstraction, introducing additional methods for explicit life cycle management is impractical: I can't rewrite every client use case to ensure it manages the new life cycle requirements.

I can think of two solution approaches, but can't decide which one is better:

a.) Use finalize() to manage the associated resource's life cycle (e.g. the temporary file is deleted in finalize()). This seems very simple to implement.

b.) Use a reference queue and java.lang.Reference (but which one, weak or phantom?) with some extra object that deletes the file when the reference is enqueued. This seems to be a bit more work to implement: I would need to create not only the new implementation, but also separate out its cleanup data and ensure the cleanup object can't be GC'd before the object that has been exposed to the user.

c.) Some other method I haven't thought of?

Which approach should I take (and why should I prefer it)? Implementation hints are also welcome.


Edit: Degree of reliability required - for my purposes it's perfectly fine if a temporary file is not cleaned up when the VM terminates abruptly. The main concern is that while the VM runs, it could very well fill up the local disk (over the course of a few days) with temporary files. This has happened to me for real with Apache Tika, which created temporary files when extracting text from certain document types (zip files were the culprit, I believe). I have a periodic cleanup scheduled on the machine, so if a file slips past cleanup it isn't the end of the world, as long as it doesn't happen regularly within a short interval.

As far as I could determine, finalize() works with the Oracle JRE. And if I interpret the javadocs correctly, References must work as documented (an object that is only softly or weakly reachable cannot remain uncleared by the time an OutOfMemoryError is thrown). This would mean that while the VM may decide not to reclaim a particular object for a long time, it has to do so at the latest when the heap gets full. In turn, this means only a limited number of my file-based blobs can exist on the heap. The VM has to clean them up at some point, or it would definitely run out of memory. Or is there any loophole that allows the VM to run OOM without clearing references (assuming they aren't strongly referenced anymore)?


Edit2: As far as I can see at this point, both finalize() and Reference should be reliable enough for my purposes, but I gather Reference may be the better solution: since its interaction with the GC can't revive dead objects, its performance impact should be lower?
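
A minimal sketch of the distinction raised in Edit2, assuming nothing beyond the standard java.lang.ref classes (the class name ReferenceKindsDemo and its method names are made up for illustration). A WeakReference still hands back its referent until it is cleared, whereas PhantomReference.get() always returns null, so cleanup code driven by a phantom reference has no way to bring the object back:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class, not part of the original question.
final class ReferenceKindsDemo {

    // A weak reference returns its referent while the referent is still strongly reachable.
    public static boolean weakSeesReferent(Object referent) {
        WeakReference<Object> weak = new WeakReference<Object>(referent);
        return weak.get() == referent;
    }

    // A phantom reference never exposes its referent: get() always returns null.
    public static boolean phantomHidesReferent(Object referent) {
        ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
        PhantomReference<Object> phantom = new PhantomReference<Object>(referent, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        Object blob = new Object();
        System.out.println("weak sees referent: " + weakSeesReferent(blob));
        System.out.println("phantom hides referent: " + phantomHidesReferent(blob));
    }
}
```

This says nothing about timing: when either kind of reference is enqueued is still up to the collector.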


Edit3: Solution approaches which rely on VM termination or startup (shutdown hook or similar) are not of use to me, since typically the VM runs for extended periods of time (server environment).

Solution

Here's a relevant item from Effective Java: Avoid finalizers

Contained within that item is a recommendation to do just what @delnan suggests in a comment: provide an explicit termination method. Plenty of examples are provided as well: InputStream.close(), Graphics.dispose(), etc. I understand that the cows may have already left the barn on that one...
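
For completeness, an explicit termination method could look roughly like the sketch below. The class CloseableFileBlob and its backingFile() accessor are hypothetical names used only for illustration, and the try-with-resources usage assumes Java 7 or later:

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical class for illustration; not part of the existing Blob abstraction.
final class CloseableFileBlob implements Closeable {

    private final File file;

    CloseableFileBlob() throws IOException {
        this.file = File.createTempFile("blob", null);
    }

    File backingFile() {
        return file;
    }

    // Explicit termination method: releases the external resource deterministically.
    @Override
    public void close() throws IOException {
        // deleteIfExists makes close() safe to call more than once.
        Files.deleteIfExists(file.toPath());
    }

    public static void main(String[] args) throws IOException {
        File seen;
        // try-with-resources calls close() even if the body throws.
        try (CloseableFileBlob blob = new CloseableFileBlob()) {
            seen = blob.backingFile();
            System.out.println("exists inside try: " + seen.exists());
        }
        System.out.println("exists after close: " + seen.exists());
    }
}
```

This only helps where callers can be changed to close the blob, which is exactly what the question rules out for existing code.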

At any rate, here's a sketch of how this might be accomplished with reference objects. First, an interface for binary data:

import java.io.IOException;

public interface Blob {
    public byte[] read() throws IOException;
    public void update(byte[] data) throws IOException;
}

Next, a file-based implementation:

import java.io.File;
import java.io.IOException;

public class FileBlob implements Blob {

    private final File file;

    public FileBlob(File file) {
        super();
        this.file = file;
    }

    @Override
    public byte[] read() throws IOException {
        throw new UnsupportedOperationException();
    }

    @Override
    public void update(byte[] data) throws IOException {
        throw new UnsupportedOperationException();
    }
}

Then, a factory to create and track the file-based blobs:

import java.io.File;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class FileBlobFactory {

    private static final long TIMER_PERIOD_MS = 10000;

    private final ReferenceQueue<File> queue;
    private final ConcurrentMap<PhantomReference<File>, String> refs;
    private final Timer reaperTimer;

    public FileBlobFactory() {
        super();
        this.queue = new ReferenceQueue<File>();
        this.refs = new ConcurrentHashMap<PhantomReference<File>, String>();
        this.reaperTimer = new Timer("FileBlob reaper timer", true);
        this.reaperTimer.scheduleAtFixedRate(new FileBlobReaper(), TIMER_PERIOD_MS, TIMER_PERIOD_MS);
    }

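    public Blob create() throws IOException {
        File blobFile = File.createTempFile("blob", null);
        //blobFile.deleteOnExit();
        // Store the path as a String so the map never holds a strong reference to the File.
        String blobFilePath = blobFile.getCanonicalPath();
        FileBlob blob = new FileBlob(blobFile);
        // After this method returns, the File is reachable only through the blob, so it
        // becomes phantom reachable once client code drops the blob.
        this.refs.put(new PhantomReference<File>(blobFile, this.queue), blobFilePath);
        return blob;
    }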

    public void shutdown() {
        this.reaperTimer.cancel();
    }

    private class FileBlobReaper extends TimerTask {
        @Override
        public void run() {
            System.out.println("FileBlob reaper task begin");
            // poll() does not block; it returns null once the queue is empty.
            Reference<? extends File> ref = FileBlobFactory.this.queue.poll();
            while (ref != null) {
                // Removing the map entry also drops the last strong reference to the reference object.
                String blobFilePath = FileBlobFactory.this.refs.remove(ref);
                File blobFile = new File(blobFilePath);
                boolean isDeleted = blobFile.delete();
                System.out.println("FileBlob reaper deleted " + blobFile + ": " + isDeleted);
                ref = FileBlobFactory.this.queue.poll();
            }
            System.out.println("FileBlob reaper task end");
        }
    }
}
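
The timer above polls the queue every 10 seconds. A possible variant, sketched here under the assumption that a dedicated daemon thread is acceptable, blocks on ReferenceQueue.remove() instead, so files are deleted as soon as the collector enqueues a reference. BlockingReaper, track() and reap() are hypothetical names, not part of the factory above:

```java
import java.io.File;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical variant of the reaper; all names are assumptions for illustration.
final class BlockingReaper {

    private final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
    private final ConcurrentMap<Reference<?>, String> paths =
            new ConcurrentHashMap<Reference<?>, String>();

    BlockingReaper() {
        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                reapLoop();
            }
        }, "FileBlob reaper thread");
        thread.setDaemon(true); // do not keep the VM alive
        thread.start();
    }

    // Ties the file at the given path to the lifetime of the owner object.
    Reference<?> track(Object owner, String path) {
        Reference<?> ref = new PhantomReference<Object>(owner, queue);
        paths.put(ref, path);
        return ref;
    }

    // Deletes the file recorded for a reference; returns false for unknown references.
    boolean reap(Reference<?> ref) {
        String path = paths.remove(ref);
        return path != null && new File(path).delete();
    }

    private void reapLoop() {
        try {
            while (true) {
                // remove() blocks until the GC enqueues a reference.
                reap(queue.remove());
            }
        } catch (InterruptedException exc) {
            Thread.currentThread().interrupt(); // restore the flag and let the thread end
        }
    }
}
```

The trade-off is one extra thread per reaper in exchange for not waking up when there is nothing to do.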

Finally, a test that includes some artificial GC "pressure" to get things going:

import java.io.IOException;

public class FileBlobTest {

    public static void main(String[] args) {
        FileBlobFactory factory = new FileBlobFactory();
        for (int i = 0; i < 10; i++) {
            try {
                factory.create();
            } catch (IOException exc) {
                exc.printStackTrace();
            }
        }

        while(true) {
            try {
                Thread.sleep(5000);
                // System.gc() is only a hint to the VM; it is repeated here to provoke collection in this demo.
                System.gc(); System.gc(); System.gc();
            } catch (InterruptedException exc) {
                exc.printStackTrace();
                System.exit(1);
            }
        }
    }
}

Which should produce some output like:

FileBlob reaper task begin
FileBlob reaper deleted C:\WINDOWS\Temp\blob1055430495823649476.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob873625122345395275.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob4123088770942737465.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob1631534546278785404.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob6150533076250997032.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob7075872276085608840.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob5998579368597938203.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob3779536278201681316.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob8720399798060613253.tmp: true
FileBlob reaper deleted C:\WINDOWS\Temp\blob3046359448721598425.tmp: true
FileBlob reaper task end
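
On Java 9 or later, java.lang.ref.Cleaner wraps the same phantom-reference-plus-queue mechanism, so the hand-written map and reaper may not be needed. The following is only a sketch under that assumption; CleanerFileBlob, DeleteFile, backingFile() and dispose() are hypothetical names:

```java
import java.io.File;
import java.io.IOException;
import java.lang.ref.Cleaner;

// Hypothetical alternative to FileBlobFactory; requires Java 9 or later.
final class CleanerFileBlob {

    // One shared Cleaner; it runs cleanup actions on its own daemon thread.
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the blob itself,
    // otherwise the blob could never become phantom reachable.
    private static final class DeleteFile implements Runnable {
        private final File file;

        DeleteFile(File file) {
            this.file = file;
        }

        @Override
        public void run() {
            file.delete();
        }
    }

    private final File file;
    private final Cleaner.Cleanable cleanable;

    CleanerFileBlob() throws IOException {
        this.file = File.createTempFile("blob", null);
        this.cleanable = CLEANER.register(this, new DeleteFile(file));
    }

    File backingFile() {
        return file;
    }

    // Optional explicit cleanup; the registered action runs at most once.
    void dispose() {
        cleanable.clean();
    }

    public static void main(String[] args) throws IOException {
        CleanerFileBlob blob = new CleanerFileBlob();
        File f = blob.backingFile();
        System.out.println("exists before dispose: " + f.exists());
        blob.dispose();
        System.out.println("exists after dispose: " + f.exists());
    }
}
```

Blobs that are simply dropped are cleaned up whenever the collector gets to them, just as with the factory above; dispose() is there for callers that can release the file early.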
