Reading dates from xls to csv through Perl
Question
I have a batch of Excel files with lines like
1/13/04 21
I am trying to convert them to .csv, but find that the line is converted to
36537,21
It turns out this is a side effect of Excel's storage rules. Excel should store dates as days since Jan 1, 1900. By that rule, this is the wrong integer, corresponding to Jan 12, 2000, not Jan 13, 2004 (which is the date meant by 1/13/04).
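- How does Excel end up making this mistake?
- And how do I sidestep the conversion that is in effect here and get the original, unformatted value?
Here is a rough sketch of the code: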
my $xlsparser = Spreadsheet::ParseExcel->new();
my $xlsbook = $xlsparser->Parse('xls_test.xls');
my $xls = $xlsbook->{Worksheet}[0];
my $csv = '';
# then a loop over rows and columns with...
my $cell = $xls->get_cell( $row, $col );
my $cellcon = $cell->unformatted();
$csv .= $cellcon;
In case my exposition isn't clear enough or you can't reproduce the issue, here is a minimal data set and script that reproduce it for me:
https://dl.dropboxusercontent.com/u/58760/softwareGrr/xls_example.pl https://dl.dropboxusercontent.com/u/58760/softwareGrr/junk.xls
Recommended answer
If you want to convert an Excel date serial value such as 36537 to date/time variables in Perl, you can write your own conversion functions. The functions are below:
sub date2excelvalue {
    my ($day1, $month, $year, $hour, $min, $sec) = @_;
    # Cumulative days before each month in a non-leap year
    my @cumul_d_in_m = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
    my $doy = $cumul_d_in_m[$month - 1] + $day1;
    # Full years + your day
    for my $y (1900 .. $year) {
        # Excel treats 1900 as a leap year, so it is counted as one here too
        my $is_leap = (($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900);
        if ($y == $year) {
            # Don't add the extra day manually if in January or February
            last if $month <= 2;
            $doy++ if $is_leap;    # leap year
        } else {
            # Full years
            $doy += 365;
            $doy++ if $is_leap;    # leap year
        }
    }    # end for y
    # Calculate the time part as a fraction of 86400 seconds
    my $excel_decimaltimepart = 0;
    my $total_seconds_from_time = $hour * 60 * 60 + $min * 60 + $sec;
    if ($total_seconds_from_time == 86400) {
        $doy++;    # just add a day
    } else {
        # Add the decimal part; fixed-point formatting avoids exponent notation
        $excel_decimaltimepart = sprintf('%.10f', $total_seconds_from_time / 86400);
        $excel_decimaltimepart =~ s/^0\.//;
    }
    return "$doy.$excel_decimaltimepart";
}
sub excelvalue2date {
    my ($excelvalueintegerpart, $excelvaluedecimalpart) = @_;
    # Cumulative days before each month, for normal and leap years
    my @cumul_d_in_m      = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
    my @cumul_d_in_m_leap = (0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366);
    my ($day, $month, $year, $hour, $min, $sec);
    my $day1 = 0;    # days in all full years counted so far
    # Full years + your day
    for my $y (1900 .. 3000) {
        # Excel treats 1900 as a leap year
        my $leap_year = ((($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900)) ? 1 : 0;
        my @cumul_d_in_m_selected = $leap_year ? @cumul_d_in_m_leap : @cumul_d_in_m;
        if (($day1 + 365 + $leap_year) >= $excelvalueintegerpart) {
            # Found this year $y (>= keeps December 31 in this year)
            $year = $y;
            my $days_in_year = $excelvalueintegerpart - $day1;
            for my $i (0 .. 11) {
                if (($days_in_year > $cumul_d_in_m_selected[$i]) && ($days_in_year <= $cumul_d_in_m_selected[$i + 1])) {
                    $month = $i + 1;
                    $day   = $days_in_year - $cumul_d_in_m_selected[$i];
                    last;
                }
            }    # end for $i months
            last;    # end year
        } else {
            # Full years
            $day1 += 365 + $leap_year;
        }
    }    # end for years, integer part comparator
    # Round to the nearest second so floating-point error cannot drop a second
    my $total_seconds_inaday = int("0.$excelvaluedecimalpart" * 86400 + 0.5);
    $sec  = $total_seconds_inaday;
    $hour = int($sec / (60 * 60));
    $sec -= $hour * (60 * 60);
    $min  = int($sec / 60);
    $sec -= $min * 60;
    return ($day, $month, $year, $hour, $min, $sec);
}
my $excelvariable = date2excelvalue(1, 3, 2018, 14, 14, 30);
print "Excel variable: $excelvariable\n";
# Split the serial value into its integer (date) and decimal (time) parts
my ($integerpart, $decimalwithoutzero) = $excelvariable =~ m/^(\d+)\.(\d+)$/;
my ($day1, $month, $year, $hour, $min, $sec) = excelvalue2date($integerpart, $decimalwithoutzero);
print "Excel Date from value: $day1, $month, $year, $hour, $min, $sec\n";
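As a cross-check on the serial arithmetic, here is a minimal sketch that maps a serial number to a calendar date by offsetting from the Unix epoch. It rests on one assumption that the question does not confirm: that the workbook may have been saved with Excel's 1904 date system (the older Mac default) rather than the 1900 system. The helper name `excel_serial_to_ymd` and the `%UNIX_EPOCH_SERIAL` table are made up for this illustration. Under that assumption, 36537 lands exactly on the date shown in the spreadsheet. Spreadsheet::ParseExcel documents a `using_1904_date()` workbook method that should tell you which system a given file uses.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Serial number that corresponds to 1970-01-01 in each Excel date system.
# The 1900 value is valid for serials after the phantom 1900-02-29 (serial 60);
# the 1904 system starts 1462 days later, hence 25569 - 1462 = 24107.
my %UNIX_EPOCH_SERIAL = (1900 => 25569, 1904 => 24107);

# Hypothetical helper: convert the integer part of an Excel serial to YYYY-MM-DD.
sub excel_serial_to_ymd {
    my ($serial, $date_system) = @_;
    my $days = int($serial) - $UNIX_EPOCH_SERIAL{$date_system};
    return strftime('%Y-%m-%d', gmtime($days * 86400));
}

print excel_serial_to_ymd(36537, 1900), "\n";    # 2000-01-12
print excel_serial_to_ymd(36537, 1904), "\n";    # 2004-01-13
```

If the file does turn out to use the 1904 system, adding 1462 to the raw serial before writing the CSV would give the value a 1900-system reader expects. This sketch ignores the time-of-day fraction and serials before 1970, which depend on platform support for negative epoch values.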
Enjoy!