Insert 10,000,000+ rows in grails


Question


I've read a lot of articles recently about populating a Grails table from a huge data set, but I seem to have hit a ceiling. My code is as follows:

class LoadingService {
    def sessionFactory
    def dataSource
    def propertyInstanceMap = org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP

    def insertFile(fileName) {
        InputStream inputFile = getClass().classLoader.getResourceAsStream(fileName)
        def pCounter = 1
        def mCounter = 1
        Sql sql = new Sql(dataSource)
        inputFile.splitEachLine(/\n|\r|,/) { line -> 
            line.each { value ->
                if(value.equalsIgnoreCase('0')) { 
                    pCounter++
                    return
                }
                sql.executeInsert("insert into Patient_MRNA (patient_id, mrna_id, value) values (${pCounter}, ${mCounter}, ${value.toFloat()})")
                pCounter++
            }
            if(mCounter % 100 == 0) {
                cleanUpGorm()
            }
            pCounter = 1
            mCounter++
        }
    }

    def cleanUpGorm() {
        sessionFactory.currentSession.clear()
        propertyInstanceMap.get().clear()
    }
}

I have disabled the second-level cache, I'm using assigned ids, and I am explicitly handling this many-to-many relationship through a join domain rather than hasMany and belongsTo.
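For context, the explicit join domain described above might look roughly like this. This is only a sketch; the class name, property names, and mapping block are assumptions, since the original post never shows the domain class:

    // Hypothetical join domain for the Patient_MRNA table; names are
    // assumptions, not the asker's actual code.
    class PatientMrna {
        long patientId
        long mrnaId
        float value

        static mapping = {
            table 'Patient_MRNA'
            id generator: 'assigned'   // assigned ids, as mentioned above
            version false              // skip the optimistic-locking column for bulk loads
        }
    }

Mapping the relationship through a domain like this (instead of hasMany/belongsTo) avoids GORM loading and dirty-checking the owning collections on every insert.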

My speed increased monumentally after applying these methods, but after a while the inserts slow almost to a stop, compared to about 623,000 per minute at the beginning.

Is there some other memory leak that I should be aware of, or have I just hit the ceiling for batch inserts in Grails?

To be clear, it takes about 2 minutes to insert 1.2 million rows, but then the inserts start to slow down.

Solution

Try doing batch inserts; it's much more efficient:

def updateCounts = sql.withBatch { stmt ->
    stmt.addBatch("insert into TABLENAME ...")
    stmt.addBatch("insert into TABLENAME ...")
    stmt.addBatch("insert into TABLENAME ...")
    ...
}
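Applied to the question's loop, this could look something like the sketch below. It reuses the `dataSource`, `inputFile`, and counters from the question's code (so it is not standalone), and the batch size of 500 is an assumption to tune for your driver and database. `withBatch` with a batch size and a parameterized SQL string hands the closure a wrapper that collects parameter lists and flushes them as JDBC batches:

    import groovy.sql.Sql

    Sql sql = new Sql(dataSource)
    def pCounter = 1
    def mCounter = 1

    // One prepared statement, flushed every 500 rows instead of a
    // round trip per insert.
    sql.withBatch(500,
            'insert into Patient_MRNA (patient_id, mrna_id, value) values (?, ?, ?)') { ps ->
        inputFile.splitEachLine(/\n|\r|,/) { line ->
            line.each { value ->
                if (!value.equalsIgnoreCase('0')) {
                    ps.addBatch([pCounter, mCounter, value.toFloat()])
                }
                pCounter++
            }
            pCounter = 1
            mCounter++
        }
    }

Because each batch is a single round trip to the database, this removes most of the per-row overhead that `executeInsert` incurs, independent of any GORM session cleanup.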

