Are there any Ruby ORMs which use cursors or smart fetch?
Question
I'm looking for a Ruby ORM to replace ActiveRecord. I've been looking at Sequel and DataMapper. They look pretty good, but neither of them seems to do the basic thing: not loading everything into memory when you don't need it.
I mean I've tried the following (or its equivalent) with ActiveRecord and Sequel on a table with lots of rows:
posts.each { |p| puts p }
Both of them go crazy on memory. They seem to load everything into memory rather than fetching rows when needed. I used find_in_batches in ActiveRecord, but it's not an acceptable solution:
- ActiveRecord is not an acceptable solution because we had too many problems with it.
- Why should my code be aware of a paging mechanism? I'm happy to configure the page size somewhere, but that's it. With find_in_batches you need to do something like:
post.find_in_batches { |batch| batch.each { |p| puts p } }
But this should be transparent.
So is there a reliable Ruby ORM somewhere that does the fetch properly?
Update:
As Sergio mentioned, in Rails 3 you can use find_each, which is exactly what I want. However, since ActiveRecord is not an option unless someone can really convince me to use it, the questions are:
- Which ORMs support the equivalent of find_each?
- How to do it?
- Why do we need a find_each, when find should do it, shouldn't it?
Recommended answer
Sequel's Dataset#each does yield individual rows one at a time, but most database drivers load the entire result into memory first.
If you are using Sequel's Postgres adapter, you can choose to use real cursors:
posts.use_cursor.each{|p| puts p}
By default this fetches 1000 rows at a time, but you can pass an option to specify the number of rows to grab per cursor fetch:
posts.use_cursor(:rows_per_fetch=>100).each{|p| puts p}
If you aren't using Sequel's Postgres adapter, you can use Sequel's pagination extension:
Sequel.extension :pagination
posts.order(:id).each_page(1000){|ds| ds.each{|p| puts p}}
However, like ActiveRecord's find_in_batches/find_each, this issues separate queries, so you need to be careful if there are concurrent modifications to the dataset you are retrieving.
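To make the concurrency caveat concrete, here is a small simulation, assuming offset-based paging (one LIMIT/OFFSET query per page). It does not touch a database: an in-memory array stands in for the table, and fetch_page and paged_scan are hypothetical names used only for this illustration.

```ruby
# Simulation only: `table` is a plain array standing in for a database table,
# and fetch_page is a hypothetical stand-in for one LIMIT/OFFSET query.
def fetch_page(table, page_no, page_size)
  table[(page_no - 1) * page_size, page_size] || []
end

# Reads the table page by page. The optional block is called after each page,
# which is where a concurrent writer could change the table between queries.
def paged_scan(table, page_size)
  seen = []
  page_no = 1
  loop do
    page = fetch_page(table, page_no, page_size)
    break if page.empty?
    seen.concat(page)
    yield page_no if block_given?
    page_no += 1
  end
  seen
end

table = [1, 2, 3, 4, 5, 6]
# Another session deletes row 1 right after the first page has been read.
seen = paged_scan(table, 2) { |page_no| table.delete(1) if page_no == 1 }
# seen is [1, 2, 4, 5, 6]: row 3 shifted into the already-read page and was never visited
```

A cursor avoids this particular problem because all rows come from a single query.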
The reason this isn't the default in Sequel is probably the same reason it isn't the default in ActiveRecord: it isn't a good default in the general case. Only queries with large result sets really need to worry about it, and most queries don't return large result sets.
At least with the Postgres adapter's cursor support, it's fairly easy to make it the default for your model:
Post.dataset = Post.dataset.use_cursor
For the pagination extension, you can't really do that, but you can wrap it in a method that makes it mostly transparent.
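One possible shape for such a wrapper is sketched below. The name each_row_in_pages is made up for this example and is not part of Sequel. The sketch assumes the dataset passed in already responds to each_page (that is, the pagination extension has been loaded for it; depending on the Sequel version this may need to be done on the database or dataset rather than globally) and that ordering by :id is appropriate, as in the answer's example.

```ruby
# Hypothetical helper, not part of Sequel: hides the paging loop behind a
# plain row-by-row iterator. Assumes `dataset` responds to #order and to
# #each_page (pagination extension loaded), and that :id gives a stable order.
def each_row_in_pages(dataset, page_size = 1000)
  # Without a block, return an Enumerator so callers can chain map, select, etc.
  return enum_for(:each_row_in_pages, dataset, page_size) unless block_given?

  dataset.order(:id).each_page(page_size) do |page|
    page.each { |row| yield row }
  end
end
```

Calling code then no longer mentions pages, for example each_row_in_pages(posts) { |p| puts p }. The caveat about separate queries and concurrent modifications still applies, since the wrapper only hides the paging and does not change how rows are fetched.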