Rails:针对集合而不是数据库表使用现有的模型验证规则 [英] Rails: use existing model validation rules against a collection instead of the database table

查看:95
本文介绍了Rails:针对集合而不是数据库表使用现有的模型验证规则的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

将Mongoid替换为ActiveRecord的第4条,(但是出于问题的考虑,这应该更改任何内容).

Rails 4, Mongoid instead of ActiveRecord (but this should change anything for the sake of the question).

假设我有一个具有某些验证规则的MyModel域类:

Let's say I have a MyModel domain class with some validation rules:

class MyModel
  include Mongoid::Document

  field :text, type: String
  field :type, type: String

  belongs_to :parent

  validates :text, presence: true
  validates :type, inclusion: %w(A B C)
  validates_uniqueness_of :text, scope: :parent # important validation rule for the purpose of the question
end

其中Parent是另一个域类:

class Parent
    include Mongoid::Document

    field :name, type: String

    has_many my_models
end

我还在数据库中的相关表中填充了一些有效数据.

Also I have the related tables in the database populated with some valid data.

现在,我想从CSV文件导入一些数据,这可能与数据库中的现有数据冲突.简单的事情是为CSV中的每一行创建一个MyModel实例,并验证它是否有效,然后将其保存到数据库中(或丢弃它).

Now, I want to import some data from an CSV file, which can conflict with the existing data in the database. The easy thing to do is to create an instance of MyModel for every row in the CSV and verify if it's valid, then save it to the database (or discard it).

类似这样的东西:

csv_rows.each |data| # simplified 
  my_model = MyModel.new(data) # data is the hash with the values taken from the CSV row

  if my_model.valid?
    my_model.save validate: false
  else
    # do something useful, but not interesting for the question's purpose
    # just know that I need to separate validation from saving
  end
end

现在,这对于有限数量的数据来说非常顺利.但是,当CSV包含成千上万的行时,这会变得很慢,因为(最坏的情况)每行都有一个写操作.

Now, this works pretty smoothly for a limited amount of data. But when the CSV contains hundreds of thousands of rows, this gets quite slow, because (worst case) there's a write operation for every row.

我想做的是存储有效项目列表,并在文件解析过程结束时将它们全部保存.所以,没什么复杂的:

What I'd like to do, is to store the list of valid items and save them all at the end of the file parsing process. So, nothing complicated:

valids = []
csv_rows.each |data|
  my_model = MyModel.new(data)

  if my_model.valid?  # THE INTERESTING LINE this "if" checks only against the database, what happens if it conflicts with some other my_models not saved yet?
    valids << my_model
  else
    # ...
  end
end

if valids.size > 0
  # bulk insert of all data
end

如果我可以确定CSV中的数据不包含重复的行或违反MyModel验证规则的数据,那将是完美的.

That would be perfect, if I could be sure that the data in the CSV does not contain duplicated rows or data that goes against the validation rules of MyModel.

我的问题是:我如何对照数据库和valids数组检查每一行,而不必重复定义在MyModel中的验证规则(避免重复这些验证规则)?

My question is: how can I check each row against the database AND the valids array, without having to repeat the validation rules defined into MyModel (avoiding to have them duplicated)?

我是否正在考虑使用其他(更有效的)方法?

Is there a different (more efficient) approach I'm not considering?

推荐答案

您可以做的是将其验证为模型,将属性保存在哈希中,推入valids数组,然后批量插入值usint mongodb的insert:

What you can do is validate as model, save the attributes in a hash, pushed to the valids array, then do a bulk insert of the values usint mongodb's insert:

valids = []
csv_rows.each |data|
  my_model = MyModel.new(data)

  if my_model.valid?
    valids << my_model.attributes
  end
end

MyModel.collection.insert(valids, continue_on_error: true)

但这不会阻止新的重复...为此,您可以使用哈希和复合键执行以下操作:

This won't however prevent NEW duplicates... for that you could do something like the following, using a hash and compound key:

valids = {}
csv_rows.each |data|
  my_model = MyModel.new(data)

  if my_model.valid?
    valids["#{my_model.text}_#{my_model.parent}"] = my_model.as_document
  end
end

那么DB Agnostic可以使用以下两种方法之一:

Then either of the following will work, DB Agnostic:

MyModel.create(valids.values)

或MongoDB'ish:

Or MongoDB'ish:

MyModel.collection.insert(valids.values, continue_on_error: true)

甚至更好

确保集合上有一个uniq索引:

OR EVEN BETTER

Ensure you have a uniq index on the collection:

class MyModel
  ...
  index({ text: 1, parent: 1 }, { unique: true, dropDups: true })
  ...
end

然后只需执行以下操作:

Then Just do the following:

MyModel.collection.insert(csv_rows, continue_on_error: true)

http://api.mongodb.org/ruby/current/Mongo/Collection.html#insert-instance_method http://mongoid.org/en/mongoid/docs/indexing.html

提示:我建议您是否期望成千上万的行以500个左右的批次进行此操作.

TIP: I recommend if you anticipate thousands of rows to do this in batches of 500 or so.

这篇关于Rails:针对集合而不是数据库表使用现有的模型验证规则的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆