Fetching zero cells in XLS file using Spreadsheet::ParseExcel


Problem description

First, refer to the Stack Overflow question Using Spreadsheet::WriteExcel.

The data structure looks like the following:

      col1      col2    col3   col4   col5
row1  School    1
row2  Dean      John
row3  No.stu.   55
row4  some irrelevant stuff
row5  School2   2
row6  Dean      Tony
row7  No. stu.  60
row8  some irrelevant stuff
row9  School    3
row10 Dean      James
row11 No.stu.   56
row12 No. teacher 20
row13 School    4
row14 Dean      Tom
row15 No.stu.   79
row16 No. teacher 21
row17 course
row18           math    2
row19           eng     4
row20 teacher   name    age   gender   race
row21           Jane    20    female   white
row22 student   name    Lee
row23           SAT     1434
row24           gender  male

The output I would like to achieve is:

      col1    col2   col3     col4         col5         col6        col7          col8         col9            col10         col11         col12        col13
row1  School  Dean   No.stu.  No. teacher  course_math  course_eng  teacher_name  teacher_age  teacher_gender  teacher_race  student_name  student_SAT  student_gender
row2  1       John   55
row3  2       Tony   60
row4  3       James  56       20
row5  4       Tom    79       21           2            4           Jane          20           female          white         Lee           1434         male

Thanks to gangabass, the code I was given is:

use strict;
use warnings;

use Spreadsheet::ParseExcel;
use FindBin qw($Bin);

my ($infile) = @ARGV;

my $parser   = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse("$Bin/Test.xls");
die $parser->error unless defined $workbook;
my ($worksheet) = $workbook->worksheets();

my %data;
my $row    = 0;
my $school = "";
while (1) {
    my $cell = $worksheet->get_cell($row, 0);
    last unless defined($cell);

    my $key = $cell->value();
    my $value = $worksheet->get_cell($row++, 1)->value();

    if ($key eq "School") {

        $school = $value;
        next;
    }

    $data{$school}->{$key} = $value;
}
sleep 1;

I started by parsing row17-row19. The first problem I came across was the empty cell at (row17, col3). The code errors out when it reaches that cell. Knowing that Excel distinguishes between 'empty' and 'blank' cells, I can work around the error by setting the cell format in the original XLS file to something other than 'General'. However, that is only a temporary solution. I wonder whether there is any command I could use to fetch the empty cell. I already tried unformatted() by adding:

my $unformattedvalue = $worksheet->get_cell( $row++, 1 )->unformatted();

However, that does not solve the problem.

I then tried to handle the data structure under the 'course' state using the following code:

my %data;
my $row    = 0;
my $school = "";
my $course = ""; #Initial value for the state of course
while (1) {
    my $cell = $worksheet->get_cell($row, 0);
    last unless defined($cell);

    my $key = $cell->value();
    my $value  = $worksheet->get_cell( $row++, 1 )->value();
    my $value1 = $worksheet->get_cell( $row++, 2 )->value(); #Fetching the value in column 3

    if ($key eq "School") {

        $school = $value;
        next;
    }

    if ($key eq "course") { #Just mimicking the how we construct the structure of 'School'

         $course = $value1;
         next;
    }

    $data{$school}->{$key} = $value;  #Must be something wrong here, but can not figure out
}

The code fails with Can't call method "value" on an undefined value at the line 'my $value1 = $worksheet->get_cell( $row++, 2 )->value()';

So in short, my questions are:

  1. In general, how do I fetch empty cells in an XLS file without manual intervention?
  2. How do I parse a structure like the following using Spreadsheet::ParseExcel?

row17 course
row18           math    2
row19           eng     4

Recommended answer

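You are doing a $row++ when you shouldn't be. Each get_cell( $row++, ... ) call advances $row, so the second lookup in your loop reads column 2 of the next row, and every pass skips a row. When no cell exists at that position, get_cell() returns undef, and calling value() on undef dies.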

Try this code instead as your starting point:

my %data;
my $state = "";
my $school = "";
my $student = "";
my ( $row_min, $row_max ) = $worksheet->row_range();
my $row = $row_min;
while ($row <= $row_max) {
    my $cell0 = $worksheet->get_cell( $row, 0 );
    my $cell1 = $worksheet->get_cell( $row, 1 );

    if (defined($cell0)) {
        my $key = $cell0->value();
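        if ($key eq 'School') {
            $state = 'school';
            $school = defined($cell1) ? $cell1->value() : '';
        } elsif ($key eq 'course') {
            $state = 'course';
        } elsif ($key eq 'teacher') {
            $state = 'teacher';
        } elsif ($key eq 'student') {
            $state = 'student';
            my $cell2 = $worksheet->get_cell( $row, 2 );
            $student = $cell2->value() if defined($cell2);
        } elsif (defined($cell1)) {
            # get_cell() returns undef for a cell that was never written,
            # so only store rows that actually have a value in column 1
            $data{$school}{$key} = $cell1->value();
        }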
    } elsif ($state eq 'course') {
        # process columns for course
    } elsif ($state eq 'teacher') {
        # process columns for teacher
    } elsif ($state eq 'student') {
        # process columns for student
    }
    $row++;
}
use Data::Dumper;
print Dumper(\%data);
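
On the first question, one way to read cells that may be empty is to wrap the lookup so that a missing cell yields a default instead of an error. The following is only a sketch: cell_value is a hypothetical helper, and MockCell and MockSheet are stand-ins invented so the snippet runs without an XLS file. They mimic just get_cell(), row_range() and value(). It also fills in the 'course' branch under the assumption that course rows have an empty column 0, the course name in column 1 and its value in column 2.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the worksheet and cell objects. Like the real
# get_cell(), the mock returns undef for a cell that was never written.
package MockCell {
    sub new   { my ( $class, $v ) = @_; return bless { v => $v }, $class }
    sub value { return $_[0]{v} }
}
package MockSheet {
    sub new       { my ( $class, @rows ) = @_; return bless { rows => [@rows] }, $class }
    sub row_range { return ( 0, $#{ $_[0]{rows} } ) }
    sub get_cell {
        my ( $self, $row, $col ) = @_;
        my $cells = $self->{rows}[$row] or return undef;
        my $v = $cells->[$col];
        return defined $v ? MockCell->new($v) : undef;
    }
}
package main;

# Hypothetical helper: the cell's value, or $default when the cell is missing.
sub cell_value {
    my ( $worksheet, $row, $col, $default ) = @_;
    my $cell = $worksheet->get_cell( $row, $col );
    return defined $cell ? $cell->value() : $default;
}

my $worksheet = MockSheet->new(
    [ 'School', 4 ],
    [ 'Dean',   'Tom' ],
    [ 'course' ],
    [ undef, 'math', 2 ],
    [ undef, 'eng',  4 ],
);

my %data;
my $school = '';
my $state  = '';
my ( $row_min, $row_max ) = $worksheet->row_range();
for my $row ( $row_min .. $row_max ) {
    my $key   = cell_value( $worksheet, $row, 0, '' );
    my $value = cell_value( $worksheet, $row, 1 );
    if ( $key eq 'School' ) {
        ( $state, $school ) = ( 'school', $value );
    }
    elsif ( $key eq 'course' ) {
        $state = 'course';
    }
    elsif ( $key ne '' ) {
        $data{$school}{$key} = $value if defined $value;
    }
    elsif ( $state eq 'course' && defined $value ) {
        # Continuation row: column 1 is the course name, column 2 its value
        $data{$school}{"course_$value"} = cell_value( $worksheet, $row, 2 );
    }
}

print "$_ => $data{$school}{$_}\n" for sort keys %{ $data{$school} };
```

With a real workbook, the object returned by $workbook->worksheets() would take the place of the mock, and only cell_value and the loop would remain.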

Update:

To process the teacher rows, I would first declare a %teacher_columns hash outside the while loop, along with $col_min and $col_max obtained from the col_range() method. Then, inside the $key eq 'teacher' elsif clause, I would do something like this:

%teacher_columns = (); # clear it out in case column names are different for this school
for my $col (2 .. $col_max) {
    my $cell = $worksheet->get_cell( $row, $col );
    $teacher_columns{$cell->value()} = $col if defined($cell) and $cell->value();
}

Then in the $state eq 'teacher' elsif clause, I would do something like this:

foreach my $key (keys %teacher_columns) {
    my $cell = $worksheet->get_cell($row, $teacher_columns{$key});
    if (defined($cell)) {
      # store cell data into proper location of your data hash
    }
}
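
Put together, the two teacher steps could look roughly like the sketch below. It is an illustration under assumptions, not a drop-in solution: MockCell and MockSheet are hypothetical stand-ins that mimic only get_cell(), col_range() and value() so the snippet runs without an XLS file, the teacher_ key prefix follows the desired output in the question, and $first_col is set to 1 because that is where the header names sit in this sample layout. Adjust the start column to match the real sheet.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; get_cell() returns undef for unwritten cells.
package MockCell {
    sub new   { my ( $class, $v ) = @_; return bless { v => $v }, $class }
    sub value { return $_[0]{v} }
}
package MockSheet {
    sub new { my ( $class, @rows ) = @_; return bless { rows => [@rows] }, $class }
    sub col_range {
        my ($self) = @_;
        my $max = 0;
        for my $r ( @{ $self->{rows} } ) { $max = $#$r if $#$r > $max }
        return ( 0, $max );
    }
    sub get_cell {
        my ( $self, $row, $col ) = @_;
        my $cells = $self->{rows}[$row] or return undef;
        my $v = $cells->[$col];
        return defined $v ? MockCell->new($v) : undef;
    }
}
package main;

my $worksheet = MockSheet->new(
    [ 'teacher', 'name', 'age', 'gender', 'race' ],     # header row
    [ undef,     'Jane', 20,    'female', 'white' ],    # data row
);
my ( $col_min, $col_max ) = $worksheet->col_range();

my $school = 4;
my %data;
my %teacher_columns;

# Header row: remember which column each field name lives in.
my $first_col = 1;    # assumption based on the sample layout
for my $col ( $first_col .. $col_max ) {
    my $cell = $worksheet->get_cell( 0, $col );
    $teacher_columns{ $cell->value() } = $col if defined $cell and $cell->value();
}

# Data row: look each field up through the column map, skipping empty cells.
for my $key ( keys %teacher_columns ) {
    my $cell = $worksheet->get_cell( 1, $teacher_columns{$key} );
    $data{$school}{"teacher_$key"} = $cell->value() if defined $cell;
}

print "$_ => $data{$school}{$_}\n" for sort keys %{ $data{$school} };
```

Keeping the name-to-column map separate from the data loop means a school whose teacher block lists its columns in a different order is still read correctly, which is the reason the answer clears %teacher_columns at each header row.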
