Hibernate, JDBC and Java performance on medium and big result set


Problem description


    Issue

    We are trying to optimize our dataserver application. It stores stocks and quotes in a MySQL database, and we are not satisfied with the fetch performance.

    Context

    - database
        - table stock: around 500 rows
        - table quote: 3,000,000 to 10,000,000 rows
        - one-to-many association: one stock owns n quotes
        - around 1000 quotes fetched per request
        - there is an index on (stockId, date) in the quote table
        - no cache, because in production the queries are always different
    - Hibernate 3
    - MySQL 5.5
    - Java 6
    - MySQL Connector/J (JDBC driver) 5.1.13
    - c3p0 pooling
    

    Tests and results

    Protocol

    • Execution times on the MySQL server are obtained by running the generated SQL queries in the mysql command-line client.
    • The server is in a test context: no other DB reads, no DB writes.
    • We fetch 857 quotes for the AAPL stock.

    Case 1: Hibernate with association

    This fills up our stock object with 857 quote objects (everything is correctly mapped in hibernate.xml).

    session.enableFilter("after").setParameter("after", 1322910573000L);
    Stock stock = (Stock) session.createCriteria(Stock.class).
    add(Restrictions.eq("stockId", stockId)).
    setFetchMode("quotes", FetchMode.JOIN).uniqueResult();
    

    SQL generated:

    SELECT this_.stockId AS stockId1_1_,
           this_.symbol AS symbol1_1_,
           this_.name AS name1_1_,
           quotes2_.stockId AS stockId1_3_,
           quotes2_.quoteId AS quoteId3_,
           quotes2_.quoteId AS quoteId0_0_,
           quotes2_.value AS value0_0_,
           quotes2_.stockId AS stockId0_0_,
           quotes2_.volume AS volume0_0_,
           quotes2_.quality AS quality0_0_,
           quotes2_.date AS date0_0_,
           quotes2_.createdDate AS createdD7_0_0_,
           quotes2_.fetcher AS fetcher0_0_
    FROM stock this_
    LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
    AND quotes2_.date > 1322910573000
    WHERE this_.stockId='AAPL'
    ORDER BY quotes2_.date ASC
    

    Results:

    • Execution time on MySQL server: ~10 ms
    • Execution time in Java: ~400 ms

    Case 2: Hibernate without association, without HQL

    Hoping to improve performance, we used code that fetches only the quote objects and adds them to a stock manually (so we don't fetch repeated stock information for every row). We used createSQLQuery to minimize the effects of aliases and HQL mess.

    String filter = " AND q.date>1322910573000";
    filter += " ORDER BY q.date DESC";
    Stock stock = new Stock(stockId);
    stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());
    

    SQL generated:

    SELECT *
    FROM quote q
    WHERE stockId='AAPL'
      AND q.date>1322910573000
    ORDER BY q.date ASC
    

    Results:

    • Execution time on MySQL server: ~10 ms
    • Execution time in Java: ~370 ms

    Case 3: JDBC without Hibernate

    String filter = " AND q.date>1322910573000";
    filter += " ORDER BY q.date DESC";
    Stock stock = new Stock(stockId);
    Connection conn = SimpleJDBC.getConnection();
    Statement stmt = conn.createStatement();
    ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
    while(rs.next())
    {
        stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
    }
    stmt.close();
    conn.close();
    

    Results:

    • Execution time on MySQL server: ~10 ms
    • Execution time in Java: ~100 ms

    Our understandings

    • The JDBC driver is common to all the cases
    • There is a fundamental time cost in the JDBC driver
    • With similar SQL queries, Hibernate spends more time than pure JDBC code converting result sets into objects
    • Hibernate createCriteria, createSQLQuery and createQuery are similar in time cost
    • In production, where we have lots of concurrent writes, the pure JDBC solution seemed to be slower than the Hibernate one (maybe because our JDBC solution was not pooled)
    • MySQL-wise, the server seems to behave very well, and the time cost is very acceptable

    Our questions

    • Is there a way to optimize the performance of the JDBC driver?
    • And will Hibernate benefit from this optimization?
    • Is there a way to optimize Hibernate performance when converting result sets?
    • Are we facing something that cannot be tuned because of Java's fundamental object and memory management?
    • Are we missing a point, are we stupid and is all of this in vain?
    • Are we French? Yes.

    Your help is very welcome.

    Solution

    Can you do a smoke test with the simplest possible query, like:

    SELECT current_timestamp()
    

    or

    SELECT 1 + 1
    

    This will tell you what the actual JDBC driver overhead is. Also, it is not clear whether both tests are run from the same machine.
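As a rough illustration of that smoke test, here is a minimal sketch that times one trivial query through plain JDBC. The class and method names (`JdbcSmokeTest`, `timeQueryNanos`) are made up for this example, and the code is kept Java 6 compatible (no try-with-resources) to match the environment listed in the question. It assumes you already have an open `Connection`.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper, not part of any library. */
final class JdbcSmokeTest {

    private JdbcSmokeTest() {
    }

    /**
     * Runs the given SQL once, reads every row, and returns the elapsed
     * wall-clock time in nanoseconds (statement creation included).
     */
    static long timeQueryNanos(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        Statement stmt = conn.createStatement();
        try {
            ResultSet rs = stmt.executeQuery(sql);
            try {
                while (rs.next()) {
                    rs.getObject(1); // touch the first column so the row is really read
                }
            } finally {
                rs.close();
            }
        } finally {
            stmt.close();
        }
        return System.nanoTime() - start;
    }
}
```

Calling `JdbcSmokeTest.timeQueryNanos(conn, "SELECT 1 + 1")` a few times and comparing the result with the ~100 ms measured for Case 3 should give an idea of how much of that time is fixed driver and network overhead rather than row processing.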

    Is there a way to optimize the performance of the JDBC driver?

    Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT). Also, I assume SimpleJDBC.getConnection() uses C3P0 connection pooling; the cost of establishing a connection is pretty high, so the first few executions could be slow.
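The warm-up advice can be sketched as a tiny harness that runs a task a number of times without measuring, then averages the timed runs. `WarmupBenchmark` and `averageNanos` are illustrative names, not an existing API, and this is only a crude approximation of what a real benchmarking tool does.

```java
/** Hypothetical micro-benchmark helper: discard warm-up runs, average the rest. */
final class WarmupBenchmark {

    private WarmupBenchmark() {
    }

    /**
     * Runs the task warmupRuns times without timing it (class loading, JIT,
     * pool initialization), then timedRuns times with timing, and returns
     * the average duration of a timed run in nanoseconds.
     */
    static long averageNanos(Runnable task, int warmupRuns, int timedRuns) {
        if (warmupRuns < 0 || timedRuns <= 0) {
            throw new IllegalArgumentException("warmupRuns must be >= 0 and timedRuns > 0");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long total = 0L;
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            total += System.nanoTime() - start;
        }
        return total / timedRuns;
    }
}
```

Wrapping each of the three cases from the question in a `Runnable` and calling, say, `WarmupBenchmark.averageNanos(task, 1000, 1000)` should give numbers that are less dominated by first-call costs than a single measurement.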

    Also, prefer named queries to ad-hoc or criteria queries.

    And will Hibernate benefit from this optimization?

    Hibernate is a very complex framework. As you can see, it accounts for about 75% of the overall execution time (roughly 300 of the ~400 ms in Case 1, against ~100 ms for raw JDBC). If you need a bare-bones ORM (no lazy loading, dirty checking or advanced caching), consider MyBatis. Or maybe even JdbcTemplate with its RowMapper abstraction.
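To give an idea of what a row-mapper style looks like, here is a hand-rolled sketch using only `java.sql`. It is not Spring's `JdbcTemplate` or `RowMapper`: `SimpleRowMapper`, `SimpleQueries` and `QuoteRow` are hypothetical names invented for this example. The column names come from the question's Case 3 code, and the code avoids Java 7+ syntax to match the Java 6 setup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the row-mapper idea: one row in, one object out. */
interface SimpleRowMapper<T> {
    T mapRow(ResultSet rs) throws SQLException;
}

/** Hypothetical immutable value class holding three columns of the quote table. */
final class QuoteRow {
    final int volume;
    final long date;
    final float value;

    QuoteRow(int volume, long date, float value) {
        this.volume = volume;
        this.date = date;
        this.value = value;
    }

    /** Maps the current row by column name, with no reflection involved. */
    static final SimpleRowMapper<QuoteRow> MAPPER = new SimpleRowMapper<QuoteRow>() {
        public QuoteRow mapRow(ResultSet rs) throws SQLException {
            return new QuoteRow(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"));
        }
    };
}

final class SimpleQueries {

    private SimpleQueries() {
    }

    /** Binds the parameters, runs the query, and maps every row with the mapper. */
    static <T> List<T> query(Connection conn, String sql, Object[] params,
                             SimpleRowMapper<T> mapper) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(sql);
        try {
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]); // JDBC parameters are 1-based
            }
            ResultSet rs = ps.executeQuery();
            try {
                List<T> out = new ArrayList<T>();
                while (rs.next()) {
                    out.add(mapper.mapRow(rs));
                }
                return out;
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}
```

A call could look like `SimpleQueries.query(conn, "select volume, date, value from quote where stockId = ? and date > ? order by date asc", new Object[] { "AAPL", Long.valueOf(1322910573000L) }, QuoteRow.MAPPER)`. Using bound parameters instead of string concatenation also avoids the SQL injection risk present in the question's snippets.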

    Is there a way to optimize Hibernate performance when converting result sets?

    Not really. Check out Chapter 19, Improving performance, in the Hibernate documentation. There is a lot of reflection and class generation happening in there. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.

    However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance doc again. It mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC: it can cache a lot, in ways you could not even imagine. On the other hand, a poor cache configuration would lead to an even slower setup.

    Check out: Caching with Hibernate + Spring - some Questions!

    Are we facing something that cannot be tuned because of Java's fundamental object and memory management?

    The JVM (especially in server configuration) is quite fast. Object creation on the heap is as fast as on the stack in e.g. C, and garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.

    Are we missing a point, are we stupid and is all of this in vain?

    I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.
