Reading dates from xls to csv through Perl

Question

I have a batch of Excel files with rows like

1/13/04 21

I am trying to convert them to .csv, but find that the line is converted to

36537,21

It turns out this is a side effect of Excel's storage rules. Excel is supposed to store dates as the number of days since Jan 1, 1900. By that rule, this is the wrong integer: it corresponds to Jan 12, 2000, not Jan 13, 2004 (which is the date meant by 1/13/04).
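
As a quick check of that arithmetic, the sketch below counts the days between Jan 12, 2000 (where serial 36537 lands in Excel's 1900 date system) and Jan 13, 2004, using only the core Time::Local module. The helper name days_between is made up for illustration. The gap is 1462 days, which is the documented offset between Excel's 1900 and 1904 date systems, so one possible explanation (an assumption, not confirmed for the files in question) is that the workbooks were saved with the 1904 date system.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Days between two calendar dates, computed in UTC so DST cannot interfere.
# Hypothetical helper; each argument is an array ref [year, month, day].
sub days_between {
    my ($from, $to) = @_;
    my @epoch = map { timegm(0, 0, 0, $_->[2], $_->[1] - 1, $_->[0]) } ($from, $to);
    return ($epoch[1] - $epoch[0]) / 86400;
}

# Date that 36537 means in the 1900 system vs. the date actually typed in.
my $gap = days_between([2000, 1, 12], [2004, 1, 13]);
print "gap in days: $gap\n";    # prints "gap in days: 1462"
```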

  • How on earth could Excel make this mistake?
  • How can I sidestep the conversion that is happening here and get the original, unformatted value?

Here is a rough sketch of the code:

use Spreadsheet::ParseExcel;

my $xlsparser = Spreadsheet::ParseExcel->new();
my $xlsbook = $xlsparser->Parse('xls_test.xls');
my $xls = $xlsbook->{Worksheet}[0];
my $csv = '';

# then a loop over rows and columns with...
  my $cell = $xls->get_cell( $row, $col );
  my $cellcon = $cell->unformatted();
  $csv .= $cellcon;

In case my exposition isn't clear enough or you can't reproduce the issue, here is a minimal data set and script that reproduce it for me:

https://dl.dropboxusercontent.com/u/58760/softwareGrr/xls_example.pl https://dl.dropboxusercontent.com/u/58760/softwareGrr/junk.xls

Recommended answer

If you want to convert an Excel date serial value like the 36537 above into date/time variables in Perl, you can write your own conversion functions. The functions are below:

use strict;
use warnings;

# Cumulative days before each month, for ordinary and leap years.
my @cumul_d_in_m      = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
my @cumul_d_in_m_leap = (0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366);

# Excel's 1900 date system treats 1900 as a leap year, so it counts as one here too.
sub is_leap_year {
  my ($y) = @_;
  return (($y % 4 == 0 && $y % 100 != 0) || $y % 400 == 0 || $y == 1900) ? 1 : 0;
}

sub date2excelvalue {
  my ($day1, $month, $year, $hour, $min, $sec) = @_;
  my $doy = $cumul_d_in_m[$month - 1] + $day1;

  # full years + your day
  for my $y (1900 .. $year) {
    if ($y == $year) {
      # don't add the extra leap day if the date is in January or February
      last if $month <= 2;
      $doy++ if is_leap_year($y);
    } else {
      # full years
      $doy += 365 + is_leap_year($y);
    }
  }

  # calculate the time part as a fraction of 86400 seconds
  my $fraction = 0;
  my $total_seconds_from_time = $hour * 60 * 60 + $min * 60 + $sec;
  if ($total_seconds_from_time == 86400) {
    $doy++;    # just add a day
  } else {
    $fraction = $total_seconds_from_time / 86400;
  }

  # fixed-point formatting avoids exponent notation for very small fractions
  my $excel_decimaltimepart = sprintf('%.10f', $fraction);
  $excel_decimaltimepart =~ s/^0\.//;
  return "$doy.$excel_decimaltimepart";
}

sub excelvalue2date {
  my ($excelvalueintegerpart, $excelvaluedecimalpart) = @_;
  my ($day, $month, $year);
  my $day1 = 0;    # days in all full years counted so far

  for my $y (1900 .. 3000) {
    my $leap_year = is_leap_year($y);
    my @cumul_d_in_m_selected = $leap_year ? @cumul_d_in_m_leap : @cumul_d_in_m;

    if ($day1 + 365 + $leap_year >= $excelvalueintegerpart) {
      # found the year; >= keeps December 31 in this year instead of the next
      $year = $y;
      my $days_in_year = $excelvalueintegerpart - $day1;
      for my $i (0 .. 11) {
        if ($days_in_year > $cumul_d_in_m_selected[$i] && $days_in_year <= $cumul_d_in_m_selected[$i + 1]) {
          $month = $i + 1;
          $day   = $days_in_year - $cumul_d_in_m_selected[$i];
          last;
        }
      }
      last;
    } else {
      # full years
      $day1 += 365 + $leap_year;
    }
  }

  # the decimal part is a fraction of a day; round to the nearest second
  my $sec  = int("0.$excelvaluedecimalpart" * 86400 + 0.5);
  my $hour = int($sec / (60 * 60));
  $sec -= $hour * 60 * 60;
  my $min = int($sec / 60);
  $sec -= $min * 60;
  return ($day, $month, $year, $hour, $min, $sec);
}

my $excelvariable = date2excelvalue(1, 3, 2018, 14, 14, 30);
print "Excel variable: $excelvariable\n";
my ($integerpart, $decimalwithoutzero) = $excelvariable =~ m/(\d+)\.(\d+)/;
my ($day1, $month, $year, $hour, $min, $sec) = excelvalue2date($integerpart, $decimalwithoutzero);
print "Excel Date from value: $day1, $month, $year, $hour, $min, $sec\n";
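
For comparison, here is a shorter sketch of the serial-to-date direction that leans on Perl's built-in gmtime instead of hand-rolled month tables. It is not the answer's method: the function name excel_serial_to_date and its $is_1904 flag are made up for illustration, and it assumes serials of 61 or more (after Excel's fictitious Feb 29, 1900) when the 1900 system is used. The two constants are the serial of 1970-01-01 in the 1900 system (25569) and the 1462-day offset between the 1900 and 1904 date systems.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Convert an Excel serial number to (year, month, day, hour, min, sec).
# $is_1904 is a hypothetical flag: pass a true value for workbooks that
# use the 1904 date system, false for the default 1900 system.
sub excel_serial_to_date {
    my ($serial, $is_1904) = @_;

    # Serial that corresponds to the Unix epoch (1970-01-01) in each system.
    my $unix_day0 = $is_1904 ? 25569 - 1462 : 25569;

    # Whole seconds since the Unix epoch, rounded to the nearest second.
    my $seconds = floor(($serial - $unix_day0) * 86400 + 0.5);

    my @t = gmtime($seconds);
    return ($t[5] + 1900, $t[4] + 1, $t[3], $t[2], $t[1], $t[0]);
}

# The value from the question, read under both date systems.
printf "%04d-%02d-%02d\n", (excel_serial_to_date(36537, 0))[0 .. 2];    # 2000-01-12
printf "%04d-%02d-%02d\n", (excel_serial_to_date(36537, 1))[0 .. 2];    # 2004-01-13
```

Read with the 1904 system, 36537 comes out as the date that was typed into the sheet. Spreadsheet::ParseExcel documents a workbook method named using_1904_date() that could supply the flag, though that is worth checking against the module's documentation for the installed version.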

Enjoy!
