Hibernate, JDBC and Java performance on medium and big result set
Issue
We are trying to optimize our dataserver application.
It stores stocks and quotes in a MySQL database.
We are not satisfied with its fetch performance.
Context
- database
- table stock: around 500 rows
- table quote: 3,000,000 to 10,000,000 rows
- one-to-many association: one stock owns n quotes
- fetching around 1000 quotes per request
- there is an index on (stockId, date) in the quote table
- no cache, because in production the queries are always different
- Hibernate 3
- mysql 5.5
- Java 6
- JDBC mysql Connector 5.1.13
- c3p0 pooling
Tests and results
Protocol
- Execution times on the MySQL server were obtained by running the generated SQL queries in the mysql command-line client.
- The server is in a test environment: no other DB reads, no DB writes
- We fetch 857 quotes for the AAPL stock
Case 1 : Hibernate with association
This fills our stock object with 857 quote objects (everything is correctly mapped in hibernate.xml)
session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class).
add(Restrictions.eq("stockId", stockId)).
setFetchMode("quotes", FetchMode.JOIN).uniqueResult();
SQL generated :
SELECT this_.stockId AS stockId1_1_,
this_.symbol AS symbol1_1_,
this_.name AS name1_1_,
quotes2_.stockId AS stockId1_3_,
quotes2_.quoteId AS quoteId3_,
quotes2_.quoteId AS quoteId0_0_,
quotes2_.value AS value0_0_,
quotes2_.stockId AS stockId0_0_,
quotes2_.volume AS volume0_0_,
quotes2_.quality AS quality0_0_,
quotes2_.date AS date0_0_,
quotes2_.createdDate AS createdD7_0_0_,
quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC
Results :
- Execution time on mysql server : ~10 ms
- Execution time in Java : ~400ms
Case 2 : Hibernate without association without HQL
Hoping to improve performance, we used code that fetches only the quote objects, and we add them to a stock manually (so we don't fetch repeated information about the stock for every row). We used createSQLQuery to minimize the effects of aliases and HQL overhead.
String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());
SQL generated :
SELECT *
FROM quote q
WHERE stockId='AAPL'
AND q.date>1322910573000
ORDER BY q.date ASC
Results :
- Execution time on mysql server : ~10 ms
- Execution time in Java : ~370ms
Case 3 : JDBC without Hibernate
String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();
Results :
- Execution time on mysql server : ~10 ms
- Execution time in Java : ~100ms
Our understandings
- The JDBC driver is common to all the cases
- There is a baseline time cost in the JDBC driver
- With similar SQL queries, Hibernate spends more time than pure JDBC code converting result sets into objects
- Hibernate createCriteria, createSQLQuery and createQuery are similar in time cost
- In production, where we have lots of concurrent writes, the pure JDBC solution seemed to be slower than the Hibernate one (maybe because our JDBC solution was not pooled)
- MySQL-wise, the server seems to behave very well, and the time cost is very acceptable
Our questions
- Is there a way to optimize the performance of the JDBC driver?
- And will Hibernate benefit from this optimization?
- Is there a way to optimize Hibernate performance when converting result sets?
- Are we facing something that cannot be tuned because of Java's fundamental object and memory management?
- Are we missing a point, are we stupid, and is all of this in vain?
- Are we French? Yes.
Your help is very welcome.
Solution
Can you do a smoke test with the simplest possible query, such as:
SELECT current_timestamp()
or
SELECT 1 + 1
This will tell you what the actual JDBC driver overhead is. Also, it is not clear whether both tests were run from the same machine.
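The smoke test could be timed from Java roughly as follows. This is only a minimal sketch: `JdbcSmokeTest` and `timeQueryNanos` are made-up names, and the `Connection` is assumed to come from wherever the application already gets it (for example the `SimpleJDBC.getConnection()` call in the question). It uses try/finally rather than try-with-resources so that it also compiles on Java 6.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of any library.
final class JdbcSmokeTest {

    /**
     * Runs the given query once, reads every row, and returns the elapsed
     * wall-clock time in nanoseconds (statement creation, execution, and
     * result reading included).
     */
    static long timeQueryNanos(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        Statement stmt = conn.createStatement();
        try {
            ResultSet rs = stmt.executeQuery(sql);
            try {
                while (rs.next()) {
                    rs.getObject(1); // make the driver actually decode the value
                }
            } finally {
                rs.close();
            }
        } finally {
            stmt.close();
        }
        return System.nanoTime() - start;
    }
}
```

A call such as `JdbcSmokeTest.timeQueryNanos(conn, "SELECT 1 + 1")` returns the round-trip cost of one trivial query. A single call still includes JVM warm-up effects, so it is worth calling it many times and looking at the later values.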
Is there a way to optimize the performance of the JDBC driver?
Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT). Also, I assume SimpleJDBC.getConnection() uses C3P0 connection pooling: the cost of establishing a connection is pretty high, so the first few executions could be slow.
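One way to repeat a measurement and discard the warm-up phase is sketched below. The class and method names (`WarmupBenchmark`, `measure`, `median`) are invented for illustration, and the workload in `main` is an in-memory stand-in that builds 857 small objects; in a real test it would be replaced by the actual fetch code. The run counts (2000 warm-up, 1000 measured) are arbitrary assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical benchmark helper, not a replacement for a real tool such as JMH.
final class WarmupBenchmark {

    /**
     * Runs the task warmupRuns times without recording anything, then
     * measuredRuns times, and returns the measured durations in
     * nanoseconds, sorted in ascending order.
     */
    static long[] measure(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // gives class loading and JIT compilation a chance to happen
        }
        long[] samples = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    /** Median of a non-empty array that is already sorted in ascending order. */
    static long median(long[] sorted) {
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): create 857 small objects, as a fetch
        // of 857 rows would. Swap in the real query code here.
        final List<long[]> sink = new ArrayList<long[]>();
        Runnable workload = new Runnable() {
            public void run() {
                sink.clear();
                for (int i = 0; i < 857; i++) {
                    sink.add(new long[] {i, 1322910573000L + i});
                }
            }
        };
        long firstRun = measure(workload, 0, 1)[0];
        long[] warmed = measure(workload, 2000, 1000);
        System.out.println("first run: " + firstRun + " ns");
        System.out.println("median after warm-up: " + median(warmed) + " ns");
    }
}
```

Comparing the first run with the median after warm-up shows how much of a one-off measurement is start-up cost rather than steady-state cost.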
Also, prefer named queries to ad-hoc or criteria queries.
And will Hibernate benefit from this optimization?
Hibernate is a very complex framework. As you can see, it accounts for about 75% of the overall execution time compared to raw JDBC (~400 ms versus ~100 ms in your measurements). If you need a bare-bones ORM (no lazy loading, dirty checking, or advanced caching), consider MyBatis, or maybe even JdbcTemplate with its RowMapper abstraction.
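To give an idea of what the row-mapper style looks like, here is a plain-JDBC sketch of the same idea. It is not Spring's API: `SimpleRowMapper`, `QuoteQueries`, and the `Quote` class below are hypothetical stand-ins, and the column list simply reuses the four columns read in Case 3 of the question. The query uses bound parameters instead of string concatenation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the question's Quote class. */
final class Quote {
    final int volume;
    final long date;
    final float value;
    final byte fetcher;

    Quote(int volume, long date, float value, byte fetcher) {
        this.volume = volume;
        this.date = date;
        this.value = value;
        this.fetcher = fetcher;
    }
}

/** Turns the current row of a ResultSet into one object. */
interface SimpleRowMapper<T> {
    T mapRow(ResultSet rs, int rowNum) throws SQLException;
}

final class QuoteQueries {

    // Explicit column list (an assumption) so the mapper can read by index.
    static final String QUOTES_AFTER =
            "SELECT q.volume, q.date, q.value, q.fetcher FROM quote q"
            + " WHERE q.stockId = ? AND q.date > ? ORDER BY q.date ASC";

    static final SimpleRowMapper<Quote> QUOTE_MAPPER = new SimpleRowMapper<Quote>() {
        public Quote mapRow(ResultSet rs, int rowNum) throws SQLException {
            return new Quote(rs.getInt(1), rs.getLong(2), rs.getFloat(3), rs.getByte(4));
        }
    };

    /** Binds the parameters, runs the query, and maps every row with the mapper. */
    static <T> List<T> query(Connection conn, String sql,
                             SimpleRowMapper<T> mapper, Object... params) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(sql);
        try {
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]); // JDBC parameters are 1-based
            }
            ResultSet rs = ps.executeQuery();
            try {
                List<T> result = new ArrayList<T>();
                int rowNum = 0;
                while (rs.next()) {
                    result.add(mapper.mapRow(rs, rowNum++));
                }
                return result;
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}
```

With a connection in hand, `QuoteQueries.query(conn, QuoteQueries.QUOTES_AFTER, QuoteQueries.QUOTE_MAPPER, "AAPL", 1322910573000L)` would return the list of quotes. The mapping is a direct constructor call per row, with no reflection or generated proxies involved.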
Is there a way to optimize Hibernate performance when converting result sets?
Not really. Check out Chapter 19, Improving performance, in the Hibernate documentation. There is a lot of reflection and class generation happening in there. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.
However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance chapter again: it mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC: it can cache a lot, in ways you could not even imagine. On the other hand, a poor cache configuration would lead to an even slower setup.
Check out: Caching with Hibernate + Spring - some Questions!
Are we facing something that cannot be tuned because of Java's fundamental object and memory management?
The JVM (especially in the server configuration) is quite fast. Object creation on the heap is as fast as on the stack in, e.g., C, and garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.
Are we missing a point, are we stupid, and is all of this in vain?
I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.