Use sed or awk to fix date format

Problem description

I'm trying to convert an HTML file containing a table to a .csv file using a bash script.

So far I've accomplished the following steps:


  1. Convert to Unix format (with dos2unix)
  2. Remove all spaces and tabs (with sed 's/[ \t]//g')
  3. Remove all the blank lines (with sed ':a;N;$!ba;s/\n//g') (this is necessary, because the HTML file has a blank line for each cell of the table... that's not my fault)
  4. Remove the unnecessary <td> and <tr> tags (with sed 's/<t.>//g')
  5. Replace </td> with ',' (with sed 's/<\/td>/,/g')
  6. Replace </tr> with end-of-line (\n) characters (with sed 's/<\/tr>/\n/g')
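
The six steps above can be sketched as a single pipeline. This is only an illustration under a few assumptions: GNU sed is available (for `\t` inside brackets and `\n` in the replacement), the input contains nothing but `<tr>`/`<td>` rows, `tr -d '\r'` stands in for dos2unix so the sketch has no extra dependency, and the closing `>` is included in the patterns for `</td>` and `</tr>`. The function name `html_table_to_csv` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads HTML table rows on stdin, writes CSV rows on stdout.
# Assumes GNU sed and input that contains only <tr>/<td> markup.
html_table_to_csv() {
    tr -d '\r' |                 # step 1: strip carriage returns (stand-in for dos2unix)
    sed 's/[ \t]//g' |           # step 2: remove all spaces and tabs
    sed ':a;N;$!ba;s/\n//g' |    # step 3: join every line into one, dropping blank lines
    sed 's/<t.>//g' |            # step 4: drop the opening <td> and <tr> tags
    sed 's/<\/td>/,/g' |         # step 5: each closing </td> becomes a comma
    sed 's/<\/tr>/\n/g'          # step 6: each closing </tr> becomes a newline
}

# Usage (file names are placeholders):
#   html_table_to_csv < table.html > table.csv
```

Because every `</td>` becomes a comma, each row produced by this sketch ends with a trailing comma, and the output ends with one empty line. An extra stage such as `sed -e 's/,$//' -e '/^$/d'` (an addition that is not part of the question) would remove both.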

Of course, I'm putting all this in a pipeline. So far, it's working great. There's one final step I'm stuck on: the table has a column of dates in the format dd/mm/yyyy, and I'd like to convert them to yyyy-mm-dd.

Is there a (simple) way to do it (with sed or awk)?

Data sample (after the whole sed pipeline):

500,2,13/09/2007,30000.00,12,B-1
501,2,15/09/2007,14000.00,8,B-2

Expected result:

500,2,2007-09-13,30000.00,12,B-1
501,2,2007-09-15,14000.00,8,B-2

I need to do this because I have to import this data into MySQL. I could open the file in Excel and change the format by hand, but I would like to skip that.

Recommended answer

Awk can do this task pretty easily:

awk '
    BEGIN { FS = OFS = "," }    # read and write comma-separated fields
    {
      split($3, date, /\//)     # date[1] = dd, date[2] = mm, date[3] = yyyy
      $3 = date[3] "-" date[2] "-" date[1]
      print $0
    }
' infile

It produces:

500,2,2007-09-13,30000.00,12,B-1
501,2,2007-09-15,14000.00,8,B-2
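
Since the question also asks about sed, here is a minimal sketch of a sed alternative. It is not part of the answer above: it assumes a sed that supports `-E` (extended regular expressions, available in GNU and BSD sed), and the function name `fix_dates_sed` is hypothetical. Unlike the awk version, it does not look at a specific column; it rewrites every dd/mm/yyyy pattern found anywhere on the line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrites every dd/mm/yyyy on stdin as yyyy-mm-dd.
# Assumes sed -E; '#' is used as the delimiter so the slashes need no escaping.
fix_dates_sed() {
    sed -E 's#([0-9]{2})/([0-9]{2})/([0-9]{4})#\3-\2-\1#g'
}

# Example with the first row of the sample data:
printf '500,2,13/09/2007,30000.00,12,B-1\n' | fix_dates_sed
# prints: 500,2,2007-09-13,30000.00,12,B-1
```

Lines without a matching date, including empty lines, pass through unchanged, so this stage can simply be appended to the end of an existing sed pipeline.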
