How to write a Rake task to import data into a Rails app?


Question

Goal: Use a cron job (or other scheduled event) to update the database from a nightly export of data from an existing system.

All data is created, updated, and deleted in an existing system. The website does not integrate directly with this system, so the Rails app simply needs to reflect the updates that appear in the data export.

I have a .txt file of ~5,000 products that looks like this:

"1234":"product name":"attr 1":"attr 2":"ABC Manufacturing":"2222"
"A134":"another product":"attr 1":"attr 2":"Foobar World":"2447"
...

All values are strings enclosed in double quotes (") and separated by colons (:).
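A line in this format can be read with Ruby's standard CSV library by treating the colon as the column separator. The following is only a minimal sketch: the variable names are illustrative, and it assumes the export follows CSV-style quoting, so a colon inside a quoted value stays part of that value.

```ruby
require "csv"

# One line copied from the sample export above.
line = '"1234":"product name":"attr 1":"attr 2":"ABC Manufacturing":"2222"'

# col_sep sets the field separator; the default quote character is already ".
fields = CSV.parse_line(line, col_sep: ":")

# Variable names are illustrative; the order follows the field list in the question.
identifier, name, attr1, attr2, vendor_name, vendor_id = fields

puts identifier   # 1234
puts vendor_name  # ABC Manufacturing
```

Unlike a plain split(":"), this strips the surrounding quotes and the trailing newline from each value.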

The fields are:

  • id: unique id; alphanumeric
  • name: product name; any character
  • attribute columns: strings; any character (e.g., size, weight, color, dimension)
  • vendor_name: string; any character
  • vendor_id: unique vendor id; numeric

Vendor information is not normalized in the current system.

What are the best practices here? Is it okay to delete the products and vendors tables and rewrite them with the new data on every cycle? Or is it better to only add new rows and update existing ones?

Notes:

  1. This data will be used to generate Orders that will persist through nightly database imports. OrderItems will need to be connected to the product ids that are specified in the data file, so we can't rely on an auto-incrementing primary key being the same for each import; the unique alphanumeric id will need to be used to join products to order_items.
  2. Ideally, I'd like the importer to normalize the Vendor data.
  3. I cannot use vanilla SQL statements, so I imagine I'll need to write a rake task in order to use Product.create(...) and Vendor.create(...) style syntax.
  4. This will be implemented on EngineYard.

Recommended answer

I wouldn't delete the products and vendors tables on every cycle. Is this a Rails app? If so, there are some really nice ActiveRecord helpers that would come in handy.

If you have a Product ActiveRecord model, you can do:

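# On Rails 4 and later, write Product.find_or_initialize_by(:identifier => ...) instead.
product = Product.find_or_initialize_by_identifier(<id you get from file>)
product.name = <name from file>
product.size = <size from file>
# ...assign the remaining attributes...
product.save!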

find_or_initialize looks up the product in the database by the identifier you specify and, if it can't find one, instantiates a new, unsaved record. The really handy thing about doing it this way is that ActiveRecord only writes to the database if any of the data has changed, and it automatically updates the timestamp fields in the table (updated_at) accordingly. One more thing: since you would be looking up records by the identifier (the id from the file), I would make sure to add an index on that field in the database.

To make a rake task that does this, I would add a rake file to the lib/tasks directory of your Rails app. We'll call it data.rake.

Inside data.rake, it would look something like this:

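require "csv"

namespace :data do
  desc "Import data from files to database"
  task :import => :environment do
    # Values are double-quoted and colon-separated, so read each line as CSV
    # with ":" as the separator; a plain split(":") would keep the quotes.
    CSV.foreach(<file to import>, :col_sep => ":") do |attrs|
      product = Product.find_or_initialize_by_identifier(attrs[0])
      product.name = attrs[1]
      # ...assign the remaining attributes...
      product.save!
    end
  end
end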

Then, to call the rake task, run "rake data:import" from the command line.
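The question also asks about normalizing the vendor data, which the task above does not cover. One possible approach, sketched here in plain Ruby, is to split each parsed row into a product part and a vendor part keyed by vendor_id, so that each vendor appears only once. The hash keys (identifier, vendor_identifier, attr1, attr2) and the third sample row are assumptions made for illustration; they are not part of the original export or schema.

```ruby
# Rows as they would come out of parsing the export. The field order
# (id, name, attr 1, attr 2, vendor_name, vendor_id) follows the question.
# The third row is made up to show a vendor that appears twice.
rows = [
  ["1234", "product name", "attr 1", "attr 2", "ABC Manufacturing", "2222"],
  ["A134", "another product", "attr 1", "attr 2", "Foobar World", "2447"],
  ["B200", "third product", "attr 1", "attr 2", "ABC Manufacturing", "2222"]
]

vendors = {}   # vendor_id => vendor attributes, one entry per distinct vendor
products = []  # product attributes, each pointing at its vendor by vendor_id

rows.each do |identifier, name, attr1, attr2, vendor_name, vendor_id|
  # Keep the first name seen for each vendor_id; later duplicates are ignored.
  vendors[vendor_id] ||= { :identifier => vendor_id, :name => vendor_name }
  products << {
    :identifier => identifier,
    :name => name,
    :attr1 => attr1,
    :attr2 => attr2,
    :vendor_identifier => vendor_id
  }
end

puts vendors.size   # 2
puts products.size  # 3
```

Inside the rake task, each vendor hash could then go through the same find-or-initialize pattern on a Vendor model, assuming that model has a column holding the vendor_id from the file.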
