Using Spreadsheet::ParseExcel in Perl, but need help

Problem description

I have a Perl program using Spreadsheet::ParseExcel. However, there are two difficulties that have arisen that I have been unable to figure out how to solve. The script for the program is as follows:

#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::ParseExcel;
use WordNet::Similarity::lesk;
use WordNet::QueryData;

my $wn = WordNet::QueryData->new();
my $lesk = WordNet::Similarity::lesk->new($wn);
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse ( 'input.xls' );

if ( !defined $workbook ) {
   die $parser->error(), ".\n";
}

WORKSHEET:
for my $worksheet ( $workbook->worksheets() ) {

    my $sheetname = $worksheet->get_name();
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();
    my $target_col;
    my $response_col;

# Skip worksheet if it doesn't contain data
    if ( $row_min > $row_max ) {
       warn "\tWorksheet $sheetname doesn't contain data. \n";
       next WORKSHEET;
    }

# Check for column headers
    COLUMN:
    for my $col ( $col_min .. $col_max ) {

        my $cell = $worksheet->get_cell( $row_min, $col );
        next COLUMN unless $cell;

        $target_col   = $col if $cell->value() eq 'Target';
        $response_col = $col if $cell->value() eq 'Response';
    }

    if ( defined $target_col && defined $response_col ) {

        ROW:
        for my $row ( $row_min + 1 .. $row_max ) {
            my $target_cell   = $worksheet->get_cell( $row, $target_col);
            my $response_cell = $worksheet->get_cell( $row, $response_col);
            if ( defined $target_cell && defined $response_cell ) {
                my $target   = $target_cell->value();
                my $response = $response_cell->value();

                my $value    = $lesk->getRelatedness( $target, $response );

                print "Worksheet   = $sheetname\n";
                print "Row         = $row\n";
                print "Target      = $target\n";
                print "Response    = $response\n";
                print "Relatedness = $value\n";                

            }
            else {

                warn "\tWorksheet $sheetname, Row = $row doesn't contain target and response data.\n";
                next ROW;
            }
        }    
    }
    else {

        warn "\tWorksheet $sheetname: Didn't find Target and Response headings.\n";
        next WORKSHEET;
    }  
}

So, my two questions:

First of all, sometimes the program returns the error "No Excel data found in file," even though the data is there. Each Excel file is formatted the same way. There is only one sheet, with the A and B columns labelled 'Target' and 'Response,' respectively, with a list of words beneath them. However, it does not ALWAYS return this error. It works for one Excel file, but it does not work for a different one, even though both are formatted the exact same way (and yes, they are both the same file type, as well). I cannot find any reason for it to not read the second file, because it is identical to the first. The only difference is that the second file was created using an Excel macro; however, why would that matter? The file types and format are exactly the same.

Second, the variables '$target' and '$response' need to be formatted as strings in order for the 'my $value' expression to work. How do I convert them into string format? The value assigned to each variable is a word from the appropriate cell of the Excel spreadsheet. I don't know what format that is (and there is no apparent way in Perl for me to check).

Any suggestions?

Recommended answer

In relation to your first question, the "no data found" error indicates some problem with the file format. I've seen this error with pseudo-Excel files such as HTML or CSV files that have an xls extension. I've also seen this error with malformed files generated by third-party apps.

You could do an initial verification of the files by doing a hexdump/xxd dump of a working and a non-working file and seeing if the overall structure is approximately the same (for example, whether both have similar magic numbers at the start and neither is HTML).
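The same check can be scripted rather than read off a hex dump. The following is a minimal sketch, not part of Spreadsheet::ParseExcel: the `sniff_format` helper and the labels it returns are hypothetical names chosen for illustration. It relies on binary .xls files being OLE2 compound documents, which begin with the bytes D0 CF 11 E0 A1 B1 1A E1, and on zip containers (such as .xlsx) beginning with `PK`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify a file by its leading bytes.
# The returned labels are hypothetical names used only in this sketch.
sub sniff_format {
    my ($path) = @_;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    my $head = '';
    read $fh, $head, 512;    # the first 512 bytes are plenty for a signature check
    close $fh;

    # OLE2 compound document signature, as used by binary .xls files.
    return 'ole2' if substr( $head, 0, 8 ) eq "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

    # Zip signature: probably an .xlsx (or other zip) renamed to .xls.
    return 'zip' if substr( $head, 0, 4 ) eq "PK\x03\x04";

    # HTML or XML markup saved with an .xls extension (optional UTF-8 BOM first).
    return 'markup'
        if $head =~ /\A(?:\xEF\xBB\xBF)?\s*<(?:!DOCTYPE|html|\?xml|table)/i;

    return 'unknown';
}

# Report the detected container type for each file named on the command line.
for my $path (@ARGV) {
    printf "%-30s %s\n", $path, sniff_format($path);
}
```

Run it on the working and the non-working file together. If they report different types, the files are not in the same format after all, whatever the extension says. A result of `ole2` for both does not prove the second file is well formed; it only rules out the pseudo-Excel cases.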

It could also be an issue with Spreadsheet::ParseExcel. I am the maintainer of that module. If you like you could send me on a "good" and "bad" file, at the email address in the docs, and I will have a look at them.
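On the second question, which the answer above does not address: Perl scalars need no conversion to "string format", and `$cell->value()` already returns the cell's formatted contents as text. A more likely cause, as far as I can tell from the WordNet::Similarity documentation, is that `getRelatedness` expects word senses in `word#pos#sense` form (for example `car#n#1`) rather than bare words. The sketch below builds such a string from raw cell text. The `to_wps` helper is hypothetical, and the defaults of noun (`n`) and first sense (`1`) are assumptions that may not suit your data.

```perl
use strict;
use warnings;

# Hypothetical helper: turn raw cell text into a "word#pos#sense" string.
# The default part of speech ('n') and sense number (1) are assumptions.
sub to_wps {
    my ( $text, $pos, $sense ) = @_;
    $pos   //= 'n';
    $sense //= 1;

    return unless defined $text;
    $text =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    return if $text eq '';

    # Leave values alone if they already look like word#pos#sense.
    return $text if $text =~ /#[nvar]#\d+\z/;

    $text = lc $text;
    $text =~ s/\s+/_/g;         # WordNet joins multi-word entries with underscores
    return "$text#$pos#$sense";
}

print to_wps('  Dog '), "\n";             # prints dog#n#1
print to_wps( 'hot dog', 'n', 1 ), "\n";  # prints hot_dog#n#1
```

In the original script this would be applied to `$target` and `$response` before the call to `$lesk->getRelatedness(...)`. If the result is still undefined, `$lesk->getError()` should report why.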
