rubyXL (Errno::ENOENT)

Views: 96 | Published: 2021/5/5 19:14:14 | Tags: ruby, excel, web-crawler, rubyxl

Problem description

I'm having trouble with a crawler I'm building with rubyXL. It traverses my file system correctly, but I receive an Errno::ENOENT error. I've looked through the rubyXL code and everything appears to be in order. My code is below; any suggestions?

/Users/.../testdata.xlsx
/Users/.../moretestdata.xlsx
/Users/.../Lab 1 Data.xlsx
/Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:404:in `initialize': No such file or directory - /Users/Dylan/.../sheet6.xml (Errno::ENOENT)
    from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:404:in `open'
    from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:404:in `block in decompress'
    from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:402:in `upto'
    from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:402:in `decompress'
    from /Users/Dylan/.rvm/gems/ruby-1.9.3-p327/gems/rubyXL-1.2.10/lib/rubyXL/parser.rb:47:in `parse'
    from xlcrawler.rb:9:in `block in xlcrawler'
    from /Users/Dylan/.rvm/rubies/ruby-1.9.3-p327/lib/ruby/1.9.1/find.rb:41:in `block in find'
    from /Users/Dylan/.rvm/rubies/ruby-1.9.3-p327/lib/ruby/1.9.1/find.rb:40:in `catch'
    from /Users/Dylan/.rvm/rubies/ruby-1.9.3-p327/lib/ruby/1.9.1/find.rb:40:in `find'
    from xlcrawler.rb:6:in `xlcrawler'
    from xlcrawler.rb:22:in `<main>'

require 'find'
require 'rubyXL'

def xlcrawler(path)
  count = 0
  Find.find(path) do |file|                                # begin iteration of each file of a specified directory
    if file =~ /\.xlsx\z/                                  # check whether the file name ends in .xlsx
      puts file                                            # ensure crawler is traversing the file system
      workbook = RubyXL::Parser.parse(file).worksheets     # creates an object containing all worksheets of an excel workbook
      workbook.each do |worksheet|                         # begin iteration over each worksheet
        data = worksheet.extract_data.to_s                 # extract data of a given worksheet - must be converted to a string in order to match a regex
        if data =~ /regex/
          puts file
          count += 1
        end      
      end
    end
  end
  puts "#{count} files were found"
end

xlcrawler('/Users/')

Solution

I did some digging through the rubyXL code on GitHub, and it looks like there is a bug in the decompress method.

  files['styles'] = Nokogiri::XML.parse(File.open(File.join(dir_path,'xl','styles.xml'),'r'))
  @num_sheets = files['workbook'].css('sheets').children.size
  @num_sheets = Integer(@num_sheets)

  #adds all worksheet xml files to files hash
  i=1
  1.upto(@num_sheets) do
    filename = 'sheet'+i.to_s # <----- BUG IS HERE
    files[i] = Nokogiri::XML.parse(File.open(File.join(dir_path,'xl','worksheets',filename+'.xml'),'r'))
    i=i+1
  end

This block of code makes an assumption about sheet numbering in Excel that does not hold. It counts the sheets and then assumes their files are numbered consecutively from 1 up to that count. However, if you delete a sheet and then create a new one, the numerical sequence is broken.
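One way to avoid that assumption is to enumerate the sheet files that actually exist instead of counting up from 1. The following is a minimal sketch, not rubyXL's actual fix: `worksheet_files` and `demo_sheet_names` are hypothetical helpers, and the sketch assumes the workbook has already been unzipped to `dir_path` with its sheets under `xl/worksheets`, as in the excerpt above.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of rubyXL): return the worksheet XML files
# that actually exist under dir_path/xl/worksheets, ordered by sheet number.
def worksheet_files(dir_path)
  pattern = File.join(dir_path, 'xl', 'worksheets', 'sheet*.xml')
  Dir.glob(pattern).sort_by { |path| File.basename(path)[/\d+/].to_i }
end

# Simulate an unzipped workbook whose numbering has a gap (no sheet6.xml)
# and return the file names the helper finds.
def demo_sheet_names
  Dir.mktmpdir do |dir|
    sheets_dir = File.join(dir, 'xl', 'worksheets')
    FileUtils.mkdir_p(sheets_dir)
    [1, 2, 3, 4, 5, 7].each do |n|
      File.write(File.join(sheets_dir, "sheet#{n}.xml"), '<worksheet/>')
    end
    worksheet_files(dir).map { |path| File.basename(path) }
  end
end

p demo_sheet_names
```

Because the helper lists what is on disk, a missing sheet6.xml is simply absent from the result rather than triggering File.open on a path that does not exist. Sorting by the numeric part keeps sheet10.xml after sheet9.xml. Relying on file names is still a simplification: in the xlsx format the mapping from sheets to files is recorded in xl/_rels/workbook.xml.rels, so reading that file would be the more robust route.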

If you open your Lab 1 Data.xlsx file and pull up the VBA developer window (by pressing Alt + F11), you will see that there is no sheet6: the sheet numbering skips it.

As you can see, this arrangement defeats the loop and raises the exception when i = 6.
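Until the parser itself is fixed, the crawler can skip workbooks that trigger this error instead of aborting the whole run. This is a minimal sketch under stated assumptions: `parse_or_skip` is a hypothetical helper, and the parser is passed in as a block so that the example runs without rubyXL installed (`fake_parse` is only a stand-in for `RubyXL::Parser.parse`).

```ruby
# Hypothetical helper: run the given parse block for a file and return its
# result, or nil if the parser raises Errno::ENOENT (e.g. a missing sheetN.xml).
def parse_or_skip(file)
  yield file
rescue Errno::ENOENT => e
  warn "Skipping #{file}: #{e.message}"
  nil
end

# Stand-in for RubyXL::Parser.parse, used only for this demonstration.
fake_parse = lambda do |file|
  raise Errno::ENOENT, 'sheet6.xml' if file.include?('Lab 1 Data')
  "parsed #{file}"
end

results = ['testdata.xlsx', 'Lab 1 Data.xlsx'].map do |file|
  parse_or_skip(file) { |f| fake_parse.call(f) }
end
p results
```

In the crawler from the question, the call would become something like `workbook = parse_or_skip(file) { |f| RubyXL::Parser.parse(f) }`, followed by `next unless workbook` before touching `.worksheets`. This only works around the failure; workbooks with gaps in their sheet numbering are skipped, not searched.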
