Transpose using AWK or Perl


Problem description

Hi, I would like to use AWK or Perl to get an output file in the format below. My input file is a space-separated text file. This is similar to an earlier question of mine, but in this case the input and output have no formatting. My column positions may change, so I would appreciate a technique that does not reference column numbers.

Input File

id quantity colour shape size colour shape size colour shape size
1 10 blue square 10 red triangle 12 pink circle 20
2 12 yellow pentagon 3 orange rectangle 4 purple oval 6

Desired Output

id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6

I am using this code by Dennis Williamson. The only problem is that the output I get has no space separation between the transposed fields. I require a single space as the separator.

#!/usr/bin/awk -f
BEGIN {
col_list = "quantity colour shape"
# Use a B ("blank") to add spaces in the output before or
# after a format string (e.g. %6dB), but generally use the numeric argument

# columns to be repeated on multiple lines may appear anywhere in
# the input, but they will be output together at the beginning of the line
repeat_fields["id"]
# since these are individually set we won't use B
repeat_fmt["id"] = "%-1s "
# additional fields to repeat on each line

ncols = split(col_list, cols)

for (i = 1; i <= ncols; i++) {
    col_names[cols[i]]
    forms[cols[i]] = "%-1s"
}
}


# save the positions of the columns using the header line
FNR == 1 {
for (i = 1; i <= NF; i++) {
    if ($i in repeat_fields) {
        repeat[++nrepeats] = i
        repeat_look[i] = i
        rformats[i] = repeat_fmt[$i]
    }
    if ($i in col_names) {
        col_nums[++n] = i
        col_look[i] = i
        formats[i] = forms[$i]
    }
}
# print the header line
for (i = 1; i <= nrepeats; i++) {
    f = rformats[repeat[i]]
    sub("d", "s", f)
    gsub("B", " ", f)
    printf f, $repeat[i]
}
for (i = 1; i <= ncols; i++) {
    f = formats[col_nums[i]]
    sub("d", "s", f)
    gsub("B", " ", f)
    printf f, $col_nums[i]
}
printf "\n"
next
}

{
for (i = 1; i <= NF; i++) {
    if (i in repeat_look) {
        f = rformats[i]
        gsub("B", " ", f)
        repeat_out = repeat_out sprintf(f, $i)

    }
    if (i in col_look) {
        f = formats[i]
        gsub("B", " ", f)
        out = out sprintf(f, $i)
        coln++
    }
    if (coln == ncols) {
        print repeat_out out
        out = ""
        coln = 0
    }
}
repeat_out = ""
}

Output

id quantitycolourshape
1 10bluesquare
1 redtrianglepink
2 circle12yellow
2 pentagonorangerectangle
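
The run-together fields above are consistent with the format string `"%-1s"` used for the transposed columns: it contains neither a trailing space nor a `B` marker, so consecutive `sprintf` results are concatenated with nothing between them. The sketch below isolates that effect from the full script. The helper name `demo_fmt` and the sample line are made up for illustration, and this shows only the spacing behaviour, not a complete fix for the script.

```shell
# Hypothetical helper: format each field of a sample line with the given
# awk printf format, turning every "B" into a blank as the script above does.
demo_fmt() {
    echo "10 blue square" | awk -v fmt="$1" '{
        gsub("B", " ", fmt)                  # "B" marks where a blank is wanted
        out = ""
        for (i = 1; i <= NF; i++)
            out = out sprintf(fmt, $i)       # same concatenation as in the script
        print out
    }'
}

demo_fmt "%-1s"     # prints: 10bluesquare
demo_fmt "%-1sB"    # prints: 10 blue square (with one trailing blank)
```

Note that the id column in the script already uses `"%-1s "` with a trailing space, which is why the id is separated from the rest in the output above.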

My apologies for not including all the information about the actual file earlier. I left it out only for simplicity, but the simplified example did not capture all my requirements.

In my actual file I am looking to transpose the fields n_cell and n_bsc for NODE SITE CHILD.

NODE SITE CHILD n_cell n_bsc

Here is a link to the actual file I am working on

Solution

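use strict;
use warnings;

# Read and discard the input header line, then print the output header.
<>;
print("id colour shape size\n");

while (<>) {
   # Split the current line ($_) on whitespace.
   my @combined_fields = split;
   # The first field is the id, repeated on every output line.
   my $id = shift(@combined_fields);
   # Discard quantity; this assumes it is the second column, as in the sample input.
   shift(@combined_fields);
   while (@combined_fields) {
       # Take the next three fields (colour, shape, size) as one group.
       my @fields = ( $id, splice(@combined_fields, 0, 3) );
       print(join(' ', @fields), "\n");
   }
}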
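
The Perl code relies on fixed positions (id first, then groups of three). Since the question asks for a technique that does not reference column numbers, here is a minimal AWK sketch, wrapped in a shell function, that locates columns by their header names instead. It is an illustration under stated assumptions, not the method from the original answer. The function name `transpose_groups` and its two arguments (the names of the columns to repeat, and the names of the columns that form one group) are hypothetical. It also assumes that the first line of each file is the header and that the grouped columns repeat in the same order.

```shell
# Hypothetical helper, not part of the original thread.
# Usage: transpose_groups "KEY NAMES" "GROUP NAMES" [FILE...]
transpose_groups() {
    keys=$1
    cols=$2
    shift 2
    awk -v keys="$keys" -v cols="$cols" '
    BEGIN {
        # Build lookup sets from the space-separated name lists.
        nk = split(keys, k, " ")
        for (i = 1; i <= nk; i++) is_key[k[i]]
        ncols = split(cols, c, " ")
        for (i = 1; i <= ncols; i++) is_col[c[i]]
    }
    FNR == 1 {
        # Learn the column positions from the header line.
        nkey = ngrp = 0
        hdr = ""
        for (i = 1; i <= NF; i++)
            if ($i in is_key) {
                key_pos[++nkey] = i
                hdr = (hdr == "" ? "" : hdr " ") $i
            }
        for (i = 1; i <= NF; i++)
            if ($i in is_col) {
                grp_pos[++ngrp] = i
                # Only the first group contributes names to the output header.
                if (ngrp <= ncols) hdr = (hdr == "" ? "" : hdr " ") $i
            }
        print hdr
        next
    }
    {
        # Key values come first on every output line.
        prefix = ""
        for (j = 1; j <= nkey; j++) prefix = prefix $(key_pos[j]) " "
        out = ""
        n = 0
        for (j = 1; j <= ngrp; j++) {
            out = (n ? out " " : "") $(grp_pos[j])
            # Emit one output line per complete group.
            if (++n == ncols) {
                print prefix out
                out = ""
                n = 0
            }
        }
    }' "$@"
}
```

For the sample input this would be called as `transpose_groups "id" "colour shape size" input.txt`, where `input.txt` is a placeholder file name. Columns named in neither list, such as quantity, are skipped, and an incomplete trailing group is dropped. For the real file, a call such as `transpose_groups "NODE SITE CHILD" "n_cell n_bsc" file` might apply, but that is an assumption because the file's full layout is not shown here.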
