将sql逻辑移动到后端-bash [英] moving sql logic to backend - bash

查看:61
本文介绍了将sql逻辑移动到后端-bash的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一种sql逻辑正在移到后端,我需要使用shell脚本生成报告.为了便于理解,我将其简化如下.

One of the sql logic is moving to backend and I need to generate a report using shell scripting. For understanding, I'm making it simple as follows.

我的输入文件-sales.txt(编号,价格,月份)

My input file - sales.txt (id, price, month)

101,50,2019-10
101,80,2020-08
101,80,2020-10
201,100,2020-09
201,350,2020-10

每个ID的输出应为6个月的窗口,例如t1 = 2020-07和t2 = 2020-12

The output should be for 6 months window for each id e.g t1=2020-07 and t2=2020-12

101,50,2020-07
101,80,2020-08
101,80,2020-09
101,80,2020-10
101,80,2020-11
101,80,2020-12
201,100,2020-09
201,350,2020-10
201,350,2020-11
201,350,2020-12

对于ID为 101 的ID,尽管没有2020-07的条目,但它应取自销售文件中可用的上个月的前一个值.因此,将2019-10年度的价格= 50用于2020-07年度.

For id 101, though there is no entry for 2020-07, it should take from the immediate previous month value that is available in the sales file. So the price=50 from 2019-10 is used for 2020-07.

对于 201 ,第一个条目本身是从2020-09开始的,因此2020-08和2020-07不适用.只要有差距,就应该传播前一个月的值.

For 201, the first entry itself is from 2020-09, so 2020-08 and 2020-07 are not applicable. Wherever there are gaps the immediate previous month value should be propagated.

我正在尝试使用awk解决此问题,我正在创建一个可重复使用的脚本util.awk,如下所示要生成缺少的值,请将其传递给sort命令,然后再次使用util.awk作为最终输出.

I'm trying to use awk to solve this problem, I'm creating a reusable script util.awk like below to generate the missing values, pipe it to sort command and then again use the util.awk for final output.

util.awk

function get_month(a,b,t1) { return strftime("%Y%m",mktime(a " " b t1)) } 
BEGIN { ss=" 0 0 0 "; ts1=" 1 " ss; ts2=" 35 " ss ; OFS="," ; x=1 } 
{ 
  tsc=get_month($3,$4,ts1);
  if ( NR>1 && $1==idp )
  {
  if( tsc == tsp) { print $1,$2,get_month($3,$4,ts1); x=0  }
  else { for(i=tsp; i < tsc; i=get_month(j1,j2,i) )  
         { 
       j1=substr(i,1,4); j2=substr(i,5,2); 
           print $1,tpr,i;
         }
       }
   }

  tsp=get_month($3,$4,ts2);  
  idp=$1;
  tpr=$2;
  if(x!=0) print $1,$2,tsc
  x=1;
  
}

但是它无限地运行 awk -F"[,-]";-f utils.awk sales.txt

尽管我在awk中尝试过,但我也欢迎在bash环境中也可以使用的其他答案.

Though I tried in awk, I welcome other answers as well that would work in bash environment.

推荐答案

总体计划:

  • 假设: sales.txt 已按第一列(数字方式)进行排序
  • 用户提供要显示的最小->最大日期范围( awk 变量 mindt maxdt )
  • 对于一个独特的 id 值,我们会将所有价格和日期加载到一个数组中( prices [] )
  • 日期将用作关联数组的索引以存储价格( prices [YYYY-MM] )
  • 一旦我们读取了给定 id ...
  • 的所有记录,
  • 按索引对 prices [] 数组进行排序(即,按 YYYY-MM 进行排序)
  • 找到最大日期的价格小于 mindt (另存为 prevprice )
  • 对于 mindt maxdt (含)之间的每个日期,如果我们有价格,则将其显示(并另存为 prevprice )...
  • 如果我们没有价格,但是我们有 prevprice ,则使用此 prevprice 作为当前日期的 price (即,用之前的价格填补空白)
  • assumption: sales.txt is already sorted (numerically) by the first column
  • user provides the min->max date range to be displayed (awk variables mindt and maxdt)
  • for a distinct id value we'll load all prices and dates into an array (prices[])
  • dates will be used as the indices of an associative array to store prices (prices[YYYY-MM])
  • once we've read all records for a given id ...
  • sort the prices[] array by the indices (ie, sort by YYYY-MM)
  • find the price for the max date less than mindt (save as prevprice)
  • for each date between mindt and maxdt (inclusive), if we have a price then display it (and save as prevprice) else ...
  • if we don't have a price but we do have a prevprice then use this prevprice as the current date's price (ie, fill the gap with the previous price)

一个(GNU) awk 想法:

mindate='2020-07'
maxdate='2020-12'

awk -v mindt="${mindate}" -v maxdt="${maxdate}" -v OFS=',' -F',' ' 

# function to add "months" (number) to "indate" (YYYY-MM)

function add_month(indate,months) {

    dhms="1 0 0 0"                                     # default day/hr/min/secs
    split(indate,arr,"-")
    yr=arr[1]
    mn=arr[2]

    return strftime("%Y-%m", mktime(arr[1]" "(arr[2]+months)" "dhms))
}

# function to print the list of prices for a given "id"

function print_id(id) {

    if ( length(prices) == 0 )                         # if prices array is empty then do nothing (ie, return)
       return

    PROCINFO["sorted_in"]="@ind_str_asc"               # sort prices[] array by index in ascending order

    for ( i in prices )                                # loop through indices (YYYY-MM)
        { if ( i < mindt )                             # as long as less than mindt
             prevprice=prices[i]                       # save the price
          else
             break                                     # no more pre-mindt indices to process
        }

    for ( i=mindt ; i<=maxdt ; i=add_month(i,1) )     # for our mindt - maxdt range
        { if ( !(i in prices) && prevprice )          # if no entry in prices[], but we have a prevprice, then ...
             prices[i]=prevprice                      # set prices[] to prevprice (ie, fill the gap)

          if ( i in prices )                          # if we have an entry in prices[] then ...
             { prevprice=prices[i]                    # update prevprice (for filling future gap) and ...
               print id,prices[i],i                   # print our data to stdout
             }
        }
}

BEGIN { split("",prices) }                             # pre-declare prices as an array

previd != $1 { print_id(previd)                        # when id changes print the prices[] array, then ...
               previd=$1                               # reset some variables for processing of the next id and ...
               prevprice=""
               delete prices                           # delete the prices[] array
             }

             { prices[$3]=$2 }                         # for the current record create an entry in prices[]

END   { print_id(previd) }                             # flush the last set of prices[] to stdout
' sales.txt

注意:这假设 sales.txt 是按第一个字段(数字方式)排序的;如果不正确,则最后一行应更改为'<(sort -n sales.txt)

NOTE: This assumes sales.txt is sorted (numerically) by the first field; if this is not true then the last line should be changed to ' <(sort -n sales.txt)

这将生成:

101,50,2020-07
101,80,2020-08
101,80,2020-09
101,80,2020-10
101,80,2020-11
101,80,2020-12
201,100,2020-09
201,350,2020-10
201,350,2020-11
201,350,2020-12

这篇关于将sql逻辑移动到后端-bash的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆