Databricks:SQL 查询的等效代码 [英] Databricks : Equivalent code for SQL query

查看:18
本文介绍了Databricks:SQL 查询的等效代码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在为查询寻找等效的数据块代码.我添加了一些示例代码和预期的代码,但特别是我正在 Databricks 中寻找 query 的等效代码.目前我被困在 CROSS APPLY STRING SPLIT 部分.

I'm looking for the equivalent databricks code for the query. I added some sample code and the expected as well, but in particular I'm looking for the equivalent code in Databricks for the query. For the moment I'm stuck on the CROSS APPLY STRING SPLIT part.

示例 SQL 数据:

    CREATE TABLE FactTurnover
    (
    ID INT,
    SalesPriceExcl NUMERIC (9,4),
  Discount VARCHAR(100)
  )
 INSERT INTO FactTurnover
 VALUES 
   (1, 100, '10'),
   (2, 39.5877, '58, 12'),
   (3, 100, '50, 10, 15'),
   (4, 100, 'B')

查询:

    ;WITH CTE AS
    (
     SELECT Id, SalesPriceExcl, 
         CASE WHEN value = 'B' THEN 0
         ELSE CAST(value as int) END AS Discount
     From FactTurnover
     CROSS APPLY STRING_SPLIT(Discount, ',')
     )
     SELECT Id,  
       Min(SalesPriceExcl) AS SalesPriceExcludingDiscount,
       EXP(SUM(LOG((100 - Discount) / 100.0))) As TotalDiscount,
       Cast(EXP(SUM(LOG((100 - Discount) / 100.0))) * 
            MIN(SalesPriceExcl) As Numeric(9,2))
        PriceAfterDiscount
     FROM CTE
     GROUP BY ID

预期结果:

| Id | SalesPriceExcludingDiscount |       TotalDiscount | PriceAfterDiscount |

|----|-----------------------------|---------------------|--------------------|

|  1 |                         100 |                 0.9 |                 90 |

|  2 |                     39.5877 | 0.36960000000000004 |              14.63 |

|  3 |                         100 | 0.38250000000000006 |              38.25 |

|  4 |                         100 |                   1 |                100 |

推荐答案

使用 SPLIT 将逗号分隔的字符串转换为数组,然后使用 LATERAL VIEW>EXPLODE 对该数组的元素进行操作.大致等效的语法(包括 CTE)是:

Use SPLIT to convert the comma-separated string to an array then use LATERAL VIEW and EXPLODE to do operations on the elements of that array. The roughly equivalent syntax (including CTEs) is:

%sql
--SELECT * FROM FactTurnover;

WITH cte AS
(
SELECT *
FROM
  (
  SELECT Id, SalesPriceExcl, SPLIT ( Discount, ',' ) AS discountArray
  FROM FactTurnover
  ) x
  LATERAL VIEW EXPLODE ( discountArray ) x AS xdiscount
)
SELECT 
  Id,
  MIN(SalesPriceExcl) AS SalesPriceExcludingDiscount,
  EXP ( SUM( LOG( ( 100 - xdiscount ) / 100.00 ) ) ) AS TotalDiscount
FROM cte
GROUP BY Id
ORDER BY Id

如果你觉得勇敢,你也可以使用 高阶函数.我在下面包含了两个示例.我会说这些更难调试,您可能应该在性能方面尝试它们,这取决于您对什么感到满意:

If you are feeling brave, you could also do this using higher order functions. I've included two examples below. I would say these are harder to debug and you should probably try them performance-wise, it depends what you're comfortable with:

%sql
-- Convert Discount text column to array with SPLIT function and filter out value 'B' from the array
;WITH filterB AS (
SELECT *, FILTER ( SPLIT ( Discount, ',' ), x -> x != 'B' ) discountArray
FROM FactTurnover
), cte1 AS (
-- Do initial calcs on array
SELECT 
  Id,
  TRANSFORM ( discountArray, discountArray -> LOG( ( 100 - discountArray ) / 100.00 ) ) discountArray2
FROM filterB
)
SELECT
  Id,
  EXP( AGGREGATE ( discountArray2, CAST( 0 AS DOUBLE ), ( x, y ) -> x + y ) ) AS x
FROM cte1;

-- all in one example
SELECT 
    Id,
    EXP( AGGREGATE( TRANSFORM( FILTER ( SPLIT ( Discount, ',' ), x -> x != 'B' ), y -> LOG( ( 100 - y ) / 100.00 ) ), CAST( 0 AS DOUBLE ), ( z, a ) -> z + a ) )
    AS final
FROM FactTurnover
ORDER BY Id

这篇关于Databricks:SQL 查询的等效代码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆