awk or shell script to change the format of a tab-delimited file
Question
I need to convert my tab-delimited data from the input format to the output format shown below. Please help me write a script.
Input file:
BRANCH_CODE DEPT_CODE ITEM_CODE UNIT_CODE 01/04/2017 02/04/2017 03/04/2017 04/04/2017 05/04/2017 06/04/2017 07/04/2017 08/04/2017 09/04/2017 10/04/2017
KI-01 DP-0001 10001 KG 31.5 45 72 84 67.5 39 57 22.5 22 56
KI-01 DP-0001 10002 KG 22 0 62 18 49 13 75 17 0 72
Output format:
DOC_DATE BRANCH_CODE DEPT_CODE ITEM_CODE UNIT_CODE QTY
01/04/2017 KI-01 DP-0001 10001 KG 31.5
01/04/2017 KI-01 DP-0001 10002 KG 22
02/04/2017 KI-01 DP-0001 10001 KG 45
02/04/2017 KI-01 DP-0001 10002 KG 0
03/04/2017 KI-01 DP-0001 10001 KG 72
03/04/2017 KI-01 DP-0001 10002 KG 62
and so on.
I was writing code like this in a .sh file:
#!/bin/bash
awk 'NR!=1{print $0}' input.tsv > temp_data_wo_header.tsv;
lc=$(wc -l < temp_data_wo_header.tsv);
for ((i=6; i<=15; i++))
do
echo "Constructing date file "$i" and ...";
(for (( c=1; c<=$lc; c++));
do
awk 'NR==1{print $'$i'}' input.tsv;
done
) > temp_date.tsv;
echo "Adding date to data file...";
paste <(awk '{print $1}' temp_date.tsv ) <(awk 'BEGIN { FS = "\t" } ; {print $1,$2,$3,$5,$'$i'}' temp_data_wo_header.tsv ) > "temp_day_"$i"_data.tsv";
echo "Finished adding...";
done;
Is there a better way to do this?
Recommended answer
Here's one in GNU awk. It needs gawk 4.0 or later because it uses true 2D arrays (arrays of arrays):
$ awk '
BEGIN {
FS=OFS="\t" } # set the delimiters
{
sub(/\r/,"",$NF) # in case of \r\n line endings
a[NR][1] # define array element
n=split($0,a[NR],FS) # split record to a[NR]
a[NR][4]=$1 OFS $2 OFS $3 OFS $4 # gather constants to one element
if(NR==1)
a[NR][4]="DOC_DATE" OFS a[NR][4] OFS "QTY"
}
END { # everything is in memory
print a[1][4]; # header print
for(j=5;j<=n;j++) # loop all data fields
for(i=2;i<=NR;i++) # loop all records
print a[1][j],a[i][4],a[i][j] # output
}' file
DOC_DATE BRANCH_CODE DEPT_CODE ITEM_CODE UNIT_CODE QTY
01/04/2017 KI-01 DP-0001 10001 KG 31.5
01/04/2017 KI-01 DP-0001 10002 KG 22
02/04/2017 KI-01 DP-0001 10001 KG 45
02/04/2017 KI-01 DP-0001 10002 KG 0
03/04/2017 KI-01 DP-0001 10001 KG 72
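If gawk is not available, the same reshaping can probably be done in any POSIX awk by replacing the arrays of arrays with ordinary arrays indexed by a `(record, field)` pair. The following is a minimal sketch under that assumption, not part of the original answer: the function name `unpivot` and the array names `date`, `key` and `qty` are hypothetical, and it assumes, as in the sample data, that the first four columns are the constant ones and every later column is a date.

```shell
#!/bin/sh
# unpivot (hypothetical name): turn one column per date into one row per
# date and item. Reads the named files, or standard input if none are given.
unpivot() {
    awk '
    BEGIN { FS = OFS = "\t" }                    # tab in, tab out
    {
        sub(/\r$/, "", $NF)                      # tolerate \r\n line endings
        if (NR == 1) {                           # header row
            n = NF                               # remember the column count
            print "DOC_DATE", $1, $2, $3, $4, "QTY"
            for (j = 5; j <= n; j++) date[j] = $j
            next
        }
        key[NR] = $1 OFS $2 OFS $3 OFS $4        # constant columns, joined once
        for (j = 5; j <= n; j++) qty[NR, j] = $j # POSIX-style 2D subscript
    }
    END {                                        # everything is in memory
        for (j = 5; j <= n; j++)                 # outer loop: dates
            for (i = 2; i <= NR; i++)            # inner loop: data rows
                print date[j], key[i], qty[i, j]
    }' "$@"
}

# Demo on a two-date excerpt of the sample input.
printf '%s\n' \
    'BRANCH_CODE	DEPT_CODE	ITEM_CODE	UNIT_CODE	01/04/2017	02/04/2017' \
    'KI-01	DP-0001	10001	KG	31.5	45' \
    'KI-01	DP-0001	10002	KG	22	0' | unpivot
```

Like the gawk version, this holds the whole file in memory and prints date by date, so the rows come out grouped by `DOC_DATE`. On a real file it would be called as `unpivot input.tsv > output.tsv`.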