Indexing huge table with Hibernate Search

Question

I'm trying to add Hibernate Search to my project to improve search performance, but I have a problem indexing huge tables. I've added the Hibernate Search dependency, and I have a simple servlet that triggers the indexing process:

    FullTextEntityManager ftem = Search.getFullTextEntityManager(em);
    try {
        ftem
        .createIndexer(MyEntity.class)
        .batchSizeToLoadObjects(25)
        .cacheMode(CacheMode.NORMAL)
        .threadsToLoadObjects(5)
        .startAndWait();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }

And in my persistence.xml:

    <property name="hibernate.show_sql" value="false" />
    <property name="hibernate.dialect" value="org.hibernate.dialect.MySQL5InnoDBDialect" />
    <property name="hibernate.archive.autodetection" value="class" />
    <property name="hibernate.search.default.directory_provider" value="filesystem" />
    <property name="hibernate.search.default.indexBase" value="/var/lucene/indexes" />

The problem is that the MyEntity table has around 25 million rows, and after about 30 seconds I get out-of-memory error messages:

2015-07-28 21:16:50,168 INFO  [stdout] (default task-60) Building index

2015-07-28 21:16:55,180 INFO  [org.hibernate.search.impl.SimpleIndexingProgressMonitor] (Hibernate Search: identifierloader-1) HSEARCH000027: Going to reindex 22593085 entities
2015-07-28 21:19:47,186 ERROR [org.jboss.as.controller.management-operation] (DeploymentScanner-threads - 2) WFLYCTL0013: Operation ("read-children-resources") failed - address: ([]): java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-07-28 21:19:58,506 WARN  [org.jboss.jca.core.connectionmanager.listener.TxConnectionListener] (Hibernate Search: identifierloader-1) IJ000305: Connection error occured: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@15a020a3[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@446189fe connection handles=1 lastReturned=1438110947536 lastValidated=1438108373971 lastCheckedOut=1438111010224 trackByTx=true pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@3fb3ab95 mcp=SemaphoreArrayListManagedConnectionPool@496e4f29[pool=MyProjectApiDS] xaResource=LocalXAResourceImpl@4f676ce7[connectionListener=15a020a3 connectionManager=798378ab warned=false currentXid=< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffc0a8010b:537a5b28:55b7cad0:167, node_name=1, branch_uid=0:ffffc0a8010b:537a5b28:55b7cad0:169, subordinatenodename=null, eis_name=java:/MyProjectApiDS > productName=MySQL productVersion=5.6.25-log jndiName=java:/MyProjectApiDS] txSync=null]: javax.resource.spi.ResourceAdapterInternalException: Unexpected error
    at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.broadcastConnectionError(BaseWrapperManagedConnection.java:699)
    at org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.connectionError(BaseWrapperManagedConnection.java:665)
    at org.jboss.jca.adapters.jdbc.WrappedConnection.checkException(WrappedConnection.java:1669)
    at org.jboss.jca.adapters.jdbc.WrappedStatement.checkException(WrappedStatement.java:1267)
    at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:467)
    at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:82)
    at org.hibernate.loader.Loader.getResultSet(Loader.java:2066)
    at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1863)
    at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1839)
    at org.hibernate.loader.Loader.scroll(Loader.java:2627)
    at org.hibernate.loader.criteria.CriteriaLoader.scroll(CriteriaLoader.java:121)
    at org.hibernate.internal.StatelessSessionImpl.scroll(StatelessSessionImpl.java:682)
    at org.hibernate.internal.CriteriaImpl.scroll(CriteriaImpl.java:394)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.loadAllIdentifiers(IdentifierProducer.java:146)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.inTransactionWrapper(IdentifierProducer.java:111)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.run(IdentifierProducer.java:95)
    at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.runWithErrorHandler(OptionallyWrapInJTATransaction.java:97)
    at org.hibernate.search.batchindexing.impl.ErrorHandledRunnable.run(ErrorHandledRunnable.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-07-28 21:19:58,514 ERROR [org.jboss.remoting.remote.connection] (XNIO-1 I/O-1) JBREM000200: Remote connection failed: java.io.IOException: Istniejące połączenie zostało gwałtownie zamknięte przez zdalnego hosta
2015-07-28 21:19:58,531 INFO  [org.jboss.as.server.deployment.scanner] (DeploymentScanner-threads - 2) WFLYDS0019: Deployment mysql-connector-java-5.1.34-bin.jar was previously deployed by this scanner but has been removed from the server deployment list by another management tool. Marker file C:\servers\wildfly-9.0.0.Final\standalone\deployments\mysql-connector-java-5.1.34-bin.jar.undeployed is being added to record this fact.
2015-07-28 21:19:58,620 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Hibernate Search: identifierloader-1) SQL Error: 0, SQLState: null
2015-07-28 21:19:58,621 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Hibernate Search: identifierloader-1) Error
2015-07-28 21:19:58,622 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: identifierloader-1) HSEARCH000058: HSEARCH000116: Unexpected error during MassIndexer operation: org.hibernate.exception.GenericJDBCException: could not extract ResultSet
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
    at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:91)
    at org.hibernate.loader.Loader.getResultSet(Loader.java:2066)
    at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1863)
    at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1839)
    at org.hibernate.loader.Loader.scroll(Loader.java:2627)
    at org.hibernate.loader.criteria.CriteriaLoader.scroll(CriteriaLoader.java:121)
    at org.hibernate.internal.StatelessSessionImpl.scroll(StatelessSessionImpl.java:682)
    at org.hibernate.internal.CriteriaImpl.scroll(CriteriaImpl.java:394)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.loadAllIdentifiers(IdentifierProducer.java:146)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.inTransactionWrapper(IdentifierProducer.java:111)
    at org.hibernate.search.batchindexing.impl.IdentifierProducer.run(IdentifierProducer.java:95)
    at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.runWithErrorHandler(OptionallyWrapInJTATransaction.java:97)
    at org.hibernate.search.batchindexing.impl.ErrorHandledRunnable.run(ErrorHandledRunnable.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: Error
    at org.jboss.jca.adapters.jdbc.WrappedConnection.checkException(WrappedConnection.java:1677)
    at org.jboss.jca.adapters.jdbc.WrappedStatement.checkException(WrappedStatement.java:1267)
    at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:467)
    at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:82)
    ... 15 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

2015-07-28 21:19:58,667 INFO  [org.hibernate.search.impl.SimpleIndexingProgressMonitor] (default task-60) HSEARCH000028: Reindexed 22593085 entities
2015-07-28 21:19:58,673 WARN  [com.arjuna.ats.jta] (Hibernate Search: identifierloader-1) ARJUNA016031: XAOnePhaseResource.rollback for < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffc0a8010b:537a5b28:55b7cad0:167, node_name=1, branch_uid=0:ffffc0a8010b:537a5b28:55b7cad0:169, subordinatenodename=null, eis_name=java:/MyProjectApiDS > failed with exception: org.jboss.jca.core.spi.transaction.local.LocalXAException: IJ001160: Could not rollback local transaction
    at org.jboss.jca.core.tx.jbossts.LocalXAResourceImpl.rollback(LocalXAResourceImpl.java:253)
    at com.arjuna.ats.internal.jta.resources.arjunacore.XAOnePhaseResource.rollback(XAOnePhaseResource.java:205)
    at com.arjuna.ats.internal.arjuna.abstractrecords.LastResourceRecord.topLevelAbort(LastResourceRecord.java:126)
    at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2993)
    at com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2972)
    at com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1675)
    at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.cancel(TwoPhaseCoordinator.java:127)
    at com.arjuna.ats.arjuna.AtomicAction.abort(AtomicAction.java:186)
    at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.rollbackAndDisassociate(TransactionImple.java:1282)
    at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.rollback(BaseTransaction.java:143)
    at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.rollback(BaseTransactionManagerDelegate.java:114)
    at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.cleanUpOnError(OptionallyWrapInJTATransaction.java:123)
    at org.hibernate.search.batchindexing.impl.ErrorHandledRunnable.run(ErrorHandledRunnable.java:54)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.jboss.jca.core.spi.transaction.local.LocalResourceException: No operations allowed after connection closed.
    at org.jboss.jca.adapters.jdbc.local.LocalManagedConnection.rollback(LocalManagedConnection.java:139)
    at org.jboss.jca.core.tx.jbossts.LocalXAResourceImpl.rollback(LocalXAResourceImpl.java:248)
    ... 15 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
    at com.mysql.jdbc.Util.getInstance(Util.java:360)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:935)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:924)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:870)
    at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1232)
    at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1225)
    at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4568)
    at org.jboss.jca.adapters.jdbc.local.LocalManagedConnection.rollback(LocalManagedConnection.java:132)
    ... 16 more

So the question is: how can I index huge tables automatically?

Recommended answer

Which version of Hibernate Search are you using? If you are using the latest 5.4 release, you can configure a transaction timeout just for the indexing. Something like this:

    fullTextSession
        .createIndexer( User.class )
        .batchSizeToLoadObjects( 25 )
        .cacheMode( CacheMode.NORMAL )
        .threadsToLoadObjects( 12 )
        .idFetchSize( 150 )
        .transactionTimeout( 1800 )
        .startAndWait();

If you can, I would recommend using the latest version.
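
To get a feel for whether a timeout such as 1800 seconds is in the right range, a rough back-of-the-envelope calculation may help. The sketch below is plain Java with no Hibernate dependency. It takes the entity count from the HSEARCH000027 log line and the batch size from the question; the throughput of 15,000 entities per second is purely a hypothetical placeholder that you would need to replace with a number measured on your own system. It also assumes the identifier-loading transaction stays open for most of the run, which is what the stack trace (IdentifierProducer.inTransactionWrapper) suggests but is not confirmed here. The class and method names are made up for illustration.

```java
public class IndexingEstimate {

    /** Number of load batches needed for the given entity count (ceiling division). */
    static long batches(long entities, int batchSize) {
        return (entities + batchSize - 1) / batchSize;
    }

    /** Seconds needed to process all entities at an assumed steady throughput (rounded up). */
    static long secondsNeeded(long entities, long entitiesPerSecond) {
        return (entities + entitiesPerSecond - 1) / entitiesPerSecond;
    }

    public static void main(String[] args) {
        long entities = 22_593_085L;      // from the log: "Going to reindex 22593085 entities"
        int batchSize = 25;               // batchSizeToLoadObjects(25) from the question
        long assumedThroughput = 15_000L; // HYPOTHETICAL entities/second; measure your own
        long timeoutSeconds = 1_800L;     // transactionTimeout(1800) from the answer

        long needed = secondsNeeded(entities, assumedThroughput);
        System.out.println("batches        = " + batches(entities, batchSize));
        System.out.println("seconds needed = " + needed);
        System.out.println("fits timeout   = " + (needed <= timeoutSeconds));
    }
}
```

With these inputs it reports 903724 batches and 1507 seconds, which would fit inside the 1800-second timeout. If your measured throughput is lower, the timeout would have to grow accordingly.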
