How to stream data to database BLOB using Hibernate (no in-memory storing in byte[])

Problem description

I'm looking for a way to stream binary data to/from the database. If possible, I'd like it to be done with Hibernate (in a database-agnostic way). All the solutions I've found involve explicitly or implicitly loading the binary data into memory as a byte[]. I need to avoid that. Let's say I want my code to be able to write a 2GB video from the database (stored in a BLOB column) to a local file, or the other way around, using no more than 256Mb of memory. It's clearly achievable, and involves no voodoo. But I can't find a way, and for now I'm trying to avoid debugging Hibernate.

Let's look at the sample code (keeping in mind -Xmx256m).

The entity class:

public class SimpleBean {
    private Long id;
    private Blob data;
    // ... skipping getters, setters and constructors.
}

Hibernate mapping fragment:

<class name="SimpleBean" table="SIMPLE_BEANS">
    <id name="id" column="SIMPLE_BEAN_ID">
        <generator class="increment" />
    </id>
    <property name="data" type="blob" column="DATA" />
</class>

Test code snippet:

Configuration cfg = new Configuration().configure("hibernate.cfg.xml");
ServiceRegistry serviceRegistry = new ServiceRegistryBuilder()
                                      .applySettings(cfg.getProperties())
                                      .buildServiceRegistry();

SessionFactory sessionFactory = cfg.buildSessionFactory(serviceRegistry);
Session session = sessionFactory.openSession();
session.beginTransaction();

File dataFile = new File("movie_1gb.avi");
long dataSize = dataFile.length();
InputStream dataStream = new FileInputStream(dataFile);

// Wrap the file stream in a Blob without materializing it in memory
LobHelper lobHelper = session.getLobHelper();
Blob dataBlob = lobHelper.createBlob(dataStream, dataSize);

session.save( new SimpleBean(dataBlob) );
session.getTransaction().commit(); // Throws java.lang.OutOfMemoryError
session.close();

dataStream.close();
sessionFactory.close();

When running that snippet I get an OutOfMemory exception. Looking at the stack trace shows that Hibernate tries to load the stream into memory and gets OutOfMemory (as it should). Here's the stack trace:

java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at org.hibernate.type.descriptor.java.DataHelper.extractBytes(DataHelper.java:183)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:121)
at org.hibernate.type.descriptor.java.BlobTypeDescriptor.unwrap(BlobTypeDescriptor.java:45)
at org.hibernate.type.descriptor.sql.BlobTypeDescriptor$4$1.doBind(BlobTypeDescriptor.java:105)
at org.hibernate.type.descriptor.sql.BasicBinder.bind(BasicBinder.java:92)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:305)
at org.hibernate.type.AbstractStandardBasicType.nullSafeSet(AbstractStandardBasicType.java:300)
at org.hibernate.type.AbstractSingleColumnStandardBasicType.nullSafeSet(AbstractSingleColumnStandardBasicType.java:57)
at org.hibernate.persister.entity.AbstractEntityPersister.dehydrate(AbstractEntityPersister.java:2603)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2857)
at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3301)
at org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:88)
at org.hibernate.engine.spi.ActionQueue.execute(ActionQueue.java:362)
at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:354)
at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:275)
at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:326)
at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:52)
at org.hibernate.internal.SessionImpl.flush(SessionImpl.java:1214)
at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:403)
at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.beforeTransactionCommit(JdbcTransaction.java:101)
at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.commit(AbstractTransactionImpl.java:175)
at ru.swemel.msgcenter.domain.SimpleBeanTest.testBasicUsage(SimpleBeanTest.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)

I'm using Hibernate 4.1.5.SP1. The exact question is: how do I avoid loading the stream into memory when storing a blob in the database with Hibernate, and use direct streaming instead? I'd like to avoid off-topic discussion about why one would store video in a database column instead of keeping it in some content repository and linking to it. Please consider it a model that is irrelevant to the question.

It seems that blob handling might depend on the capabilities of the chosen dialect: Hibernate may try to load everything into memory because the underlying database doesn't support streaming blobs, or something along those lines. If that's the case, I'd like to see some kind of comparative table of how the different dialects handle blobs.
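
For example, a rough way to check what the active dialect reports (a sketch only, assuming Hibernate 4.x, where the relevant switch appears to be Dialect.useInputStreamToInsertBlob()):

import org.hibernate.dialect.Dialect;
import org.hibernate.engine.spi.SessionFactoryImplementor;

// "sessionFactory" is the SessionFactory built in the snippet above
Dialect dialect = ((SessionFactoryImplementor) sessionFactory).getDialect();

// true  -> the InputStream should be handed to the JDBC driver as-is
// false -> Hibernate copies the stream into a byte[] first (the path in the stack trace)
System.out.println(dialect.getClass().getSimpleName()
        + ".useInputStreamToInsertBlob() = " + dialect.useInputStreamToInsertBlob());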

Thanks a lot for your help!

Answer

For those looking for the same thing:

My bad, the code works as intended (it streams without trying to copy everything into memory) for PostgreSQL (and probably lots of others). Hibernate's inner workings depend on the selected dialect. The one I used in the first place overrides the direct use of streams in favor of a BinaryStream backed by a byte[].
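
For illustration only (this fragment is not from my original configuration): the dialect in play is simply whatever hibernate.cfg.xml selects, e.g. something like this for PostgreSQL:

<hibernate-configuration>
    <session-factory>
        <!-- The dialect choice decides whether BLOBs are bound as streams or as byte[] -->
        <property name="hibernate.dialect">org.hibernate.dialect.PostgreSQLDialect</property>
        <property name="hibernate.connection.driver_class">org.postgresql.Driver</property>
        <!-- connection URL, credentials and mapping resources omitted -->
    </session-factory>
</hibernate-configuration>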

There are also no problems with performance, since in the PostgreSQL case only the OID (a number) is loaded, and other dialects (including the byte[] implementation) probably lazy-load the data. I just ran some dirty tests: no visible difference across 10,000 loads of the entity with and without the binary data field.
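
For completeness, a rough sketch of the read direction (database to local file) under the same setup as the question; SimpleBean's getData() getter, the entity id, the file name and the buffer size are all placeholders here:

Session session = sessionFactory.openSession();
session.beginTransaction();

SimpleBean bean = (SimpleBean) session.get(SimpleBean.class, 1L);

// Copy the BLOB to a local file through a small fixed-size buffer,
// so no more than one buffer's worth of data is held in memory at a time
InputStream in = bean.getData().getBinaryStream();
OutputStream out = new FileOutputStream("movie_copy.avi");
byte[] buffer = new byte[64 * 1024];
int read;
while ((read = in.read(buffer)) != -1) {
    out.write(buffer, 0, read);
}
out.close();
in.close();

session.getTransaction().commit();
session.close();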

Storing the data in the database does seem to be slower than just saving it on disk as an external file, though. But it saves you a lot of headache with backups, the limitations of a particular file system, concurrent updates, and so on. That's off-topic here, however.
