Nokogiri XML import feed organisation?


Question

I have built a site that relies on an XML feed, which I currently parse with Nokogiri. Everything works fine, but all of the import code currently lives in my Admin controller so that I can invoke the import via a URL, i.e. /admin/import/.

I can't help but think that this doesn't belong in the controller. Is there a better way to do this, e.g. moving the code into a standalone import.rb file so that it is only accessible from the console? If so, where should I put this file: in the /lib/ directory?

Here is a code snippet:

class AdminController < ApplicationController

    def import
      f = File.open("#{Rails.root}/public/feed.xml")
      @doc = Nokogiri::XML(f)
      f.close

      ignore_list = [] # ignore list

      @doc.xpath("/*/product[not(name = following-sibling::product/name)]").each do |node|
        if !ignore_list.include? node.xpath("./programName").inner_text.strip
          Product.create(:name => clean_field(node.xpath("./name").inner_text).downcase, 
          :description => clean_field(node.xpath("./description").inner_text),
          :brand => Brand.find_or_create_by_name(clean_field_key(node.xpath("./brand").inner_text).downcase),         
          :merchant => Merchant.find_or_create_by_name(clean_field_key(node.xpath("./programName").inner_text).downcase),     
          :image => node.xpath("./imageUrl").inner_text.strip,
          :link => node.xpath("./productUrl").inner_text.strip,
          :category => Category.find_or_create_by_name(clean_field_key(node.xpath("./CategoryName").inner_text).downcase),
          :price => "£" + node.xpath("./price").inner_text.strip)
          print clean_field(node.xpath("./name").inner_text).downcase + "\n"       
        end
      end
    end
end

Recommended answer

Your code sounds like it would work well as a Rails runner script. Runner scripts run outside the normal Rails process for your site but have full access to the ActiveRecord setup, so you can easily reach your database.

I don't think Rails is as strict about this file's location as it is for its other files, but I'd create a subdirectory called "scripts" under "app" and put it there. Keeping a tidy directory structure is a good thing for maintenance.

You don't say whether you're running Rails 3 or an earlier version. If you're running Rails 3, type rails runner -h at the command line in your Rails app's directory for more information.
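
As a rough sketch of that approach, the import logic could move out of the controller into a plain Ruby class that a runner script then calls. Everything named here is an assumption for illustration: the `FeedImporter` class, its location (for example `lib/feed_importer.rb`), the `rows_from_xml` helper, and the `store` argument, which stands in for the `Product` model. The sketch is also simplified: the `clean_field` helpers, the brand/merchant/category lookups, and the "£" price prefix from the question are left out.

```ruby
# Hypothetical lib/feed_importer.rb: a sketch, not the original poster's code.
class FeedImporter
  # XML element name => attribute name, taken from the question's snippet.
  FIELDS = {
    "name"        => :name,
    "description" => :description,
    "imageUrl"    => :image,
    "productUrl"  => :link,
    "price"       => :price
  }.freeze

  # Same XPath as the question: skips products whose name reappears later.
  PRODUCTS_XPATH = "/*/product[not(name = following-sibling::product/name)]"

  # Parses the feed into an array of plain hashes, one per <product>.
  # Nokogiri is required lazily so the rest of the class loads without it.
  def self.rows_from_xml(path)
    require "nokogiri"
    doc = File.open(path) { |f| Nokogiri::XML(f) }
    doc.xpath(PRODUCTS_XPATH).map do |node|
      row = { :program_name => node.xpath("./programName").inner_text.strip }
      FIELDS.each { |tag, key| row[key] = node.xpath("./#{tag}").inner_text.strip }
      row
    end
  end

  # store: any object responding to create(attributes), e.g. the Product model.
  def initialize(store, ignore_list = [])
    @store = store
    @ignore_list = ignore_list
  end

  # Creates one record per row whose program is not ignored; returns the count.
  def import(rows)
    wanted = rows.reject { |row| @ignore_list.include?(row[:program_name]) }
    wanted.each do |row|
      attrs = row.reject { |key, _| key == :program_name }
      @store.create(attrs.merge(:name => row[:name].downcase))
    end
    wanted.size
  end
end
```

Once the file is required or on the autoload path, a one-line script such as `FeedImporter.new(Product).import(FeedImporter.rows_from_xml("#{Rails.root}/public/feed.xml"))` could be passed to `rails runner`. Keeping the parsing separate from the record creation also means the import step can be exercised with plain hashes and a stand-in store, without a database.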

Some folks feel that scripts should be run using rake, which I agree with IF they're manipulating files and folders and doing general maintenance of the Rails space your app runs in. If you're performing a periodic task that is part of the housekeeping of the database, or, in your case, retrieving content used in support of your app, I think it should be a "runner" task.

You can build functionality so that you can still trigger the code via a URL, but I think there's potential for abuse, especially if a request could overwrite needed data or fill the database with duplicate or redundant data. I think it'd be better to run the task periodically, probably via cron initiated by the OS, just to keep things on a regular interval, or to run it only manually. If you keep it available via a URL, I'd recommend protecting it with a password to help avoid abuse.
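
One way to limit the damage from repeated runs, whether they come from cron or from a URL, is to make the import idempotent, so that a second run updates existing rows instead of adding duplicates. The sketch below shows the idea with an in-memory hash standing in for the products table. The `upsert_by_name` method and the `catalog` hash are hypothetical names; in a Rails app the lookup would go through the model's own finders instead.

```ruby
# Sketch only: `catalog` stands in for the products table, keyed by name.
def upsert_by_name(catalog, rows)
  created = 0
  rows.each do |row|
    key = row[:name].strip.downcase          # normalise the lookup key
    created += 1 unless catalog.key?(key)    # count genuinely new products
    # Merge new values over any existing record rather than adding a second one.
    catalog[key] = (catalog[key] || {}).merge(row).merge(:name => key)
  end
  created
end

catalog = {}
feed = [{ :name => "Red Shoe", :price => "9.99" }]
first_run  = upsert_by_name(catalog, feed)   # adds the product
second_run = upsert_by_name(catalog, feed)   # finds it and adds nothing new
```

Here `first_run` is 1 and `second_run` is 0, and `catalog` still holds a single entry after both runs.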

Finally, as someone who's been doing this a long time, I'd recommend a bit of structure and alignment in your code:

Product.create(
  :name        => clean_field(node.xpath("./name").inner_text).downcase,
  :description => clean_field(node.xpath("./description").inner_text),
  :brand       => Brand.find_or_create_by_name(clean_field_key(node.xpath("./brand").inner_text).downcase),
  :merchant    => Merchant.find_or_create_by_name(clean_field_key(node.xpath("./programName").inner_text).downcase),
  :image       => node.xpath("./imageUrl").inner_text.strip,
  :link        => node.xpath("./productUrl").inner_text.strip,
  :category    => Category.find_or_create_by_name(clean_field_key(node.xpath("./CategoryName").inner_text).downcase),
  :price       => "£" + node.xpath("./price").inner_text.strip
)

Simple alignment can go a long way toward helping you maintain your code, or toward preserving the sanity of whoever ends up maintaining it down the line. I'd probably keep it looking like:

Product.create(
  :name        => clean_field( node.xpath( "./name"        ).inner_text ).downcase,
  :description => clean_field( node.xpath( "./description" ).inner_text ),

  :brand       => Brand.find_or_create_by_name(    clean_field_key( node.xpath( "./brand"       ).inner_text ).downcase ),
  :merchant    => Merchant.find_or_create_by_name( clean_field_key( node.xpath( "./programName" ).inner_text ).downcase ),

  :image       => node.xpath( "./imageUrl"   ).inner_text.strip,
  :link        => node.xpath( "./productUrl" ).inner_text.strip,

  :category    => Category.find_or_create_by_name( clean_field_key( node.xpath( "./CategoryName" ).inner_text ).downcase ),

  :price       => "£" + node.xpath( "./price" ).inner_text.strip
)

But that's just me. I like having more whitespace, especially when there are nested method calls, and I like having some vertical alignment among common or similar calls. I find it makes the code easier to scan and makes differences easier to spot, which helps when you are debugging or looking for a particular thing. Again, that's just my preference, but it's something I've learned over many years of writing code in a lot of different languages.
