Import records from CSV in small chunks (Ruby on Rails)


Question


I need to import a large CSV file in small chunks, with one chunk imported every X hours.

I made the following rake task:

task :import_reviews => :environment do
  require 'csv'
  CSV.foreach('reviews.csv', :headers => true) do |row|
    Review.create(row.to_hash)
  end
end

Using Heroku Scheduler I could run this task every day, but I want to break the import up into several chunks, for example 100 records per day.

That means I need to keep track of the last row imported and start from the row after it the next time the rake task runs. How can I implement this?

Thanks in advance!

Solution

Read the rest of the CSV into an array and, outside the CSV.foreach loop, write it back to the same CSV file, so that the file gets smaller on each run. I suppose I don't have to give this in code, but if necessary leave a comment and I will.

If you want to keep the CSV whole, add a "processed" field to the CSV, fill it with a 1 once a row has been read, and filter those rows out on the next run.

EDIT: this isn't tested and could surely be better, but it shows what I mean by the first approach:
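The "processed" column idea could look roughly like the sketch below. It is only an illustration under assumptions: the helper name `import_unprocessed`, the column name `processed`, and the use of a block instead of calling `Review.create` directly are all hypothetical choices, made so the snippet runs outside a Rails app.

```ruby
require 'csv'

# Hypothetical helper (not from the original answer): imports up to `limit`
# rows that are not yet flagged, then rewrites the file with the flags set.
# Assumes the CSV has a header row; a missing "processed" column is added.
def import_unprocessed(path, limit)
  table = CSV.read(path, :headers => true)
  imported = 0

  table.each do |row|
    break if imported >= limit
    next if row['processed'] == '1'   # already imported on an earlier run

    attrs = row.to_hash
    attrs.delete('processed')         # the flag is bookkeeping, not model data
    yield attrs                       # e.g. Review.create(attrs) in the rake task

    row['processed'] = '1'
    imported += 1
  end

  # Persist the flags so the next run skips these rows.
  File.write(path, table.to_csv)
  imported
end
```

Inside the rake task this might be called as `import_unprocessed('reviews.csv', 100) { |attrs| Review.create(attrs) }`. Because the flags are only written after the loop finishes, a crash in the middle of a run would cause the same rows to be imported again on the next run.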

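require 'csv'

index = 1
# CSV::Writer was removed in Ruby 1.9; CSV.open is the current way to write a file.
CSV.open('new.csv', 'wb') do |csv_out|
  header_written = false
  CSV.foreach('reviews.csv', :headers => true) do |row|
    # Copy the header row once so the next run can still use :headers => true.
    unless header_written
      csv_out << row.headers
      header_written = true
    end
    if index < 101
      Review.create(row.to_hash)  # import the first 100 records
    else
      csv_out << row              # keep the remaining rows for the next run
    end
    index += 1
  end
end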

Afterward, delete reviews.csv and rename new.csv to reviews.csv.
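The final swap can be done from the same rake task. A minimal sketch, assuming both files live on the same filesystem; the helper name `replace_with_remaining` is hypothetical:

```ruby
require 'fileutils'

# Hypothetical helper: replaces the source CSV with the file that holds
# the rows that have not been imported yet.
def replace_with_remaining(source_path, remaining_path)
  # FileUtils.mv replaces an existing destination file, so the old
  # source file does not need to be deleted separately.
  FileUtils.mv(remaining_path, source_path)
end
```

For the file names used above, that would be `replace_with_remaining('reviews.csv', 'new.csv')`. One caveat: Heroku dynos have an ephemeral filesystem, so a file rewritten during one scheduler run may not be there on the next. In that setup the CSV (or a row offset) would likely need to live somewhere persistent, such as the database or external storage.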
