Awk code with associative arrays -- array doesn't seem populated, but no error


Problem description



  1. Question: Why does it seem that date_list[d] and isin_list[i] are not getting populated, in the code segment below?

  2. AWK Code (on GNU-AWK on a Win-7 machine)

    BEGIN { FS = "," } # This SEBI  data set has comma-separated fields (NSE snapshots are pipe-separated)
    
    # UPDATE the lists for DATE ($10), firm_ISIN ($9), EXCHANGE ($12), and FII_ID ($5).
    ( $17~/_EQ\>/ )    {
        if (date[$10]++ == 0) date_list[d++] = $10;   # Dates appear in order in raw data
        if (isin[$9]++ == 0) isin_list[i++] = $9;     # ISINs appear out of order in raw data
        print $10, date[$10], $9, isin[$9], date_list[d], d, isin_list[i], i 
    }
    

  3. Input data

    49290,C198962542782200306,6/30/2003,433581,F5811773991200306,S5405611832200306,B5086397478200306,NESTLE INDIA LTD.,INE239A01016,6/27/2003,1,E9035083824200306,REG_DL_STLD_02,591.13,5655,3342840.15,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
    49291,C198962542782200306,6/30/2003,433563,F6292896459200306,S6344227311200306,B6110521493200306,GRASIM INDUSTRIES LTD.,INE047A01013,6/27/2003,1,E9035083824200306,REG_DL_STLD_02,495.33,3700,1832721,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
    49292,C198962542782200306,6/30/2003,433681,F6513202607200306,S1724027402200306,B6372023178200306,HDFC BANK LTD,INE040A01018,6/26/2003,1,E745964372424200306,REG_DL_STLD_02,242,2600,629200,REG_DL_INSTR_EQ,REG_DL_DLAY_D,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
    49293,C7885768925200306,6/30/2003,48128,F4406661052200306,S7376401565200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,3,E912851176274200306,REG_DL_STLD_04,125,44600,5575000,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
    49294,C7885768925200306,6/30/2003,48129,F4500260787200306,S1312094035200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,4,E912851176274200306,REG_DL_STLD_04,125,445600,55700000,REG_DL_INSTR_EQ,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
    49295,C7885768925200306,6/30/2003,48130,F6425024637200306,S2872499118200306,B4576522576200306,Maruti Udyog Limited,INE585B01010,6/28/2003,3,E912851176274200306,REG_DL_STLD_04,125,48000,6000000,REG_DL_INSTR_EU,REG_DL_DLAY_P,DL_RPT_TYPE_N,DL_AMDMNT_DEL_00
    

  4. Output that I am getting

    6/27/2003 1 INE239A01016 1  1  1
    6/27/2003 2 INE047A01013 1  1  2
    6/26/2003 1 INE040A01018 1  2  3
    6/28/2003 1 INE585B01010 1  3  4
    6/28/2003 2 INE585B01010 2  3  4
    

  5. Expected output

    As far as I can tell, the print statement correctly prints (i) $10 (the date), (ii) date[$10], the count for each date, (iii) $9 (the firm ID, called ISIN), (iv) isin[$9], the count for each ISIN, (v) d (the index into date_list, i.e. the number of unique dates), and (vi) i (the index into isin_list, i.e. the number of unique ISINs). I should also get two more columns, columns 5 and 7 below, for date_list[d] and isin_list[i], with values that look like $10 and $9.

    6/27/2003 1 INE239A01016 1  6/27/2003 1 INE239A01016  1
    6/27/2003 2 INE047A01013 1  6/27/2003 1 INE047A01013  2
    6/26/2003 1 INE040A01018 1  6/26/2003 2 INE040A01018  3
    6/28/2003 1 INE585B01010 1  6/28/2003 3 INE585B01010  4
    6/28/2003 2 INE585B01010 2  6/28/2003 3 INE585B01010  4
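
A likely explanation for the two empty columns is the post-increment in `date_list[d++] = $10`: the value is stored at the old `d`, and `d` then points at the next unused slot, so `date_list[d]` is still unset when `print` runs (awk silently yields an empty string for an unset element, hence no error). The sketch below reproduces this on made-up three-line input; the names `seen`, `list`, and `n` are illustrative stand-ins for `date`, `date_list`, and `d`.

```shell
# Toy input: three records, two unique values.
out=$(printf 'a\nb\na\n' | awk '
{
    if (seen[$1]++ == 0) list[n++] = $1   # stored at the old n, then n is incremented
    # list[n] is the next unused slot (empty); list[n-1] is the last value stored
    printf "%s|%s|%s|%d\n", $1, list[n], list[n-1], n
}')
printf '%s\n' "$out"
# a||a|1
# b||b|2
# a||b|2
```

Note that `list[n-1]` is the most recently added unique value, which is not necessarily the value of the current record (see the third output line).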
    

Solution

The actual code I now use is:

{
    if (date[$10]++ == 0) date_list[d++] = $10   # unique dates, in order of first appearance
    if (isin[$9]++ == 0) isin_list[i++] = $9     # unique ISINs, in order of first appearance
}
($11 ~ /1|2|3|5|9|1[24]/) { ++BNR[$10, $9] }     # count matching records per (date, ISIN)
END {
    for (u = 0; u < d; u++)
        for (v = 0; v < i; v++)
            if ((date_list[u], isin_list[v]) in BNR) {
                BR = BNR[date_list[u], isin_list[v]]
                print date_list[u], isin_list[v], BR
            }
}
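
One point worth checking in code of this shape: awk joins a subscript list such as `[a, b]` into a single string key using `SUBSEP`, so a lookup only finds a count if it uses exactly the same subscripts as the increment did. The following is a minimal sketch on made-up two-column input, not the SEBI data; the array name `count` and the field positions are assumptions for illustration.

```shell
# Toy input: column 1 stands in for the date, column 2 for the ISIN.
out=$(printf 'd1,x\nd1,x\nd2,y\n' | awk -F, '
{
    if (date[$1]++ == 0) date_list[d++] = $1
    if (isin[$2]++ == 0) isin_list[i++] = $2
    count[$1, $2]++            # key is $1 SUBSEP $2
}
END {
    for (u = 0; u < d; u++)
        for (v = 0; v < i; v++)
            if ((date_list[u], isin_list[v]) in count)   # membership test creates no element
                print date_list[u], isin_list[v], count[date_list[u], isin_list[v]]
}')
printf '%s\n' "$out"
# d1 x 2
# d2 y 1
```

Using `(a, b) in count` rather than `count[a, b] > 0` avoids creating empty elements for every date/ISIN pair that never occurred.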

Thanks a lot to everyone.
