Does MySQL Connector/J buffer rows when "streaming" a ResultSet?

Question

Based on my reading, I see that the way to stream a ResultSet in MySQL using the MySQL JDBC driver is these two commands:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

My question is: could an expert clarify whether streaming the ResultSet using the above code returns one row to the client, then goes back to the server to fetch the next row, and so on (terribly inefficient), or whether it is smart enough to do buffered streaming, like a BufferedReader? If it does buffered streaming, how do I set the buffer size?

The documentation says:

The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row.
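
For reference, the setup the documentation describes can be wrapped into a complete read loop. This is only a minimal sketch: the class name `StreamingQuery` and the method `countRowsStreaming` are hypothetical, the caller is assumed to supply an open Connector/J `Connection`, and the loop just counts rows rather than processing them.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingQuery {

    /**
     * Runs the query in streaming mode and returns the number of rows read.
     * The Connection is assumed to come from MySQL Connector/J; this helper
     * itself is hypothetical and not part of the driver.
     */
    public static long countRowsStreaming(Connection conn, String sql) throws SQLException {
        // Forward-only, read-only statement, as the documentation requires.
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Integer.MIN_VALUE is the signal to the driver to stream the result set.
            stmt.setFetchSize(Integer.MIN_VALUE);
            long rows = 0;
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++; // a real caller would read the columns here
                }
            }
            return rows;
        }
    }
}
```

Both the statement and the result set are closed by try-with-resources, so the result set is fully consumed or closed before the connection is used for anything else.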

Does this mean that if I have 10M rows then there are 10M roundtrips to the server to get these rows? This is terribly inefficient. How can I stream the ResultSet but have it buffered so that I don't have to make so many roundtrips?

It seems MySQL does some buffering automatically when fetchSize is set to Integer.MIN_VALUE. In my test I was able to read more than 40M rows in less than 20 minutes using setFetchSize(Integer.MIN_VALUE). This translates to about 30,000 rows per second. I don't know how big the average row was, but it's hard to imagine 30,000 roundtrips per second.
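
The arithmetic behind that estimate can be checked with a short sketch. The class and method names are made up for illustration, and the inputs are the rounded figures quoted above (40M rows, 20 minutes), so the result is a lower bound on the real rate.

```java
public class ThroughputEstimate {

    /** Rows per second for a given row count and elapsed time in seconds. */
    public static double rowsPerSecond(long rows, long elapsedSeconds) {
        return (double) rows / elapsedSeconds;
    }

    /** Time available per row, in microseconds, at the given rate. */
    public static double microsPerRow(double rowsPerSecond) {
        return 1_000_000.0 / rowsPerSecond;
    }

    public static void main(String[] args) {
        // Figures from the test described above: 40M rows in 20 minutes.
        double rate = rowsPerSecond(40_000_000L, 20 * 60);
        System.out.printf("%.0f rows/s, %.0f microseconds per row%n",
                rate, microsPerRow(rate));
    }
}
```

With these inputs the rate is roughly 33,000 rows per second, which leaves about 30 microseconds per row. If every row needed its own roundtrip, the network latency would have to stay under that figure.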

Also a separate question: what does MySQL do if the result set has more elements than the fetchSize? e.g., result set has 10M rows and fetchSize is set to 1000. What happens then?

Recommended answer

It seems MySQL does some buffering automatically when fetchSize is set to Integer.MIN_VALUE.

It does, at least sometimes. I tested the behaviour of MySQL Connector/J version 5.1.37 using Wireshark. For the table ...

CREATE TABLE lorem (
    id INT AUTO_INCREMENT PRIMARY KEY,
    tag VARCHAR(7),
    text1 VARCHAR(255),
    text2 VARCHAR(255)
    )

... with test data ...

 id  tag      text1            text2
---  -------  ---------------  ---------------
  0  row_000  Lorem ipsum ...  Lorem ipsum ...
  1  row_001  Lorem ipsum ...  Lorem ipsum ...
  2  row_002  Lorem ipsum ...  Lorem ipsum ...
...
999  row_999  Lorem ipsum ...  Lorem ipsum ...

(where both `text1` and `text2` actually contain 255 characters in each row)

... and the code ...

try (Statement s = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY)) {
    s.setFetchSize(Integer.MIN_VALUE);
    String sql = "SELECT * FROM lorem ORDER BY id";
    try (ResultSet rs = s.executeQuery(sql)) {
        // network traffic was inspected at this point, before any call to rs.next()
    }
}

... immediately after the s.executeQuery(sql) – i.e., before rs.next() is even called – MySQL Connector/J had retrieved the first ~140 rows from the table.

In fact, when querying only the tag column ...

    String sql = "SELECT tag FROM lorem ORDER BY id";

MySQL Connector/J immediately retrieved all 1000 rows as shown by the Wireshark list of network frames:

Frame 19, which sent the query to the server, looked like this:

The MySQL server responded with frame 20, which started with ...

... and was immediately followed by frame 21, which began with ...

... and so on until the server had sent frame 32, which ended with

Since the only difference was the amount of information being returned for each row, we can conclude that MySQL Connector/J decides on an appropriate buffer size based on the maximum length of each returned row and the amount of free memory available.
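
To get a feel for the volumes involved, the two observations can be compared with a rough estimate. This sketch counts only the declared character lengths of the columns from the test table; it ignores the id column and all protocol overhead, so the numbers are approximations. The class and method names are hypothetical.

```java
public class RowSizeEstimate {

    /** Sum of the given column lengths: a rough per-row payload in characters. */
    public static int approxRowChars(int... columnLengths) {
        int total = 0;
        for (int len : columnLengths) {
            total += len;
        }
        return total;
    }

    public static void main(String[] args) {
        int fullRow = approxRowChars(7, 255, 255); // tag, text1, text2
        int tagOnly = approxRowChars(7);           // tag
        // ~140 rows were received up front for SELECT *, all 1000 for SELECT tag.
        System.out.println("SELECT *   : 140 rows x " + fullRow + " chars = " + 140L * fullRow);
        System.out.println("SELECT tag : 1000 rows x " + tagOnly + " chars = " + 1000L * tagOnly);
    }
}
```

Under these assumptions the first ~140 full rows amount to about 72,000 characters, while all 1000 tag values come to only 7,000. That fits the observation that the number of rows received up front depended on how much data each row carried.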

what does MySQL do if the result set has more elements than the fetchSize? e.g., result set has 10M rows and fetchSize is set to 1000. What happens then?

MySQL Connector/J initially retrieves the first fetchSize group of rows, then as rs.next() moves through them it will eventually retrieve the next group of rows. That is true even for setFetchSize(1) which, incidentally, is the way to really get only one row at a time.

(Note that setFetchSize(n) for n>0 requires useCursorFetch=true in the connection URL. That is apparently not required for setFetchSize(Integer.MIN_VALUE).)
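
A minimal sketch of that configuration follows. The host, port and database name are placeholders, and the class and its two helper methods are hypothetical; only the useCursorFetch=true URL property and setFetchSize(n) come from the note above.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorFetchConfig {

    /** Builds a Connector/J URL with cursor-based fetching enabled. */
    public static String cursorFetchUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database + "?useCursorFetch=true";
    }

    /**
     * Creates a forward-only, read-only statement that asks for fetchSize rows
     * per group. The connection is assumed to have been opened with a URL such
     * as the one returned by cursorFetchUrl.
     */
    public static Statement batchedStatement(Connection conn, int fetchSize) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        try {
            stmt.setFetchSize(fetchSize);
        } catch (SQLException e) {
            stmt.close(); // do not leak the statement if the fetch size is rejected
            throw e;
        }
        return stmt;
    }

    public static void main(String[] args) {
        // Placeholder host and database, for illustration only.
        System.out.println(cursorFetchUrl("localhost", 3306, "test"));
    }
}
```

A caller would pass the URL to DriverManager.getConnection together with credentials, then call batchedStatement(conn, 1000) to have the 10M-row result delivered in groups of 1000 as rs.next() advances.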
