Robust approach for building SQL queries programmatically
Question
I have to resort to raw SQL where the ORM falls short (using Django 1.7). The problem is that most of the queries end up being 80-90% similar. I cannot figure out a robust and secure way to build queries without sacrificing reusability.
Is string concatenation the only way out, i.e. build parameter-less query strings using if-else conditions, then safely include the parameters using prepared statements (to avoid SQL injection)? I want to follow a simple approach for templating SQL for my project instead of reinventing a mini ORM.
For example, consider the following query:
SELECT id, name, team, rank_score
FROM
( SELECT id, name, team
ROW_NUMBER() OVER (PARTITION BY team
ORDER BY count_score DESC) AS rank_score
FROM
(SELECT id, name, team
COUNT(score) AS count_score
FROM people
INNER JOIN scores on (scores.people_id = people.id)
GROUP BY id, name, team
) AS count_table
) AS rank_table
WHERE rank_score < 3
How do I:
a) add an optional WHERE constraint on people, or
b) change INNER JOIN to LEFT OUTER, or
c) change COUNT to SUM, or
d) completely skip the OVER / PARTITION clause?
Recommended answer
Better query
For starters, you can fix the syntax, simplify and clarify quite a bit:
SELECT *
FROM (
SELECT p.person_id, p.name, p.team, sum(s.score)::int AS score
,rank() OVER (PARTITION BY p.team
ORDER BY sum(s.score) DESC)::int AS rnk
FROM person p
JOIN score s USING (person_id)
GROUP BY 1
) sub
WHERE rnk < 3;
This builds on my updated table layout (tables person and score, joined on person_id).
You do not need the additional subquery. Window functions are executed after aggregate functions, so you can nest the aggregate inside the window function as demonstrated.
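The evaluation order can be illustrated outside the database. The following is only a plain-Python sketch of the semantics, not how PostgreSQL executes the query; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def rank_after_aggregate(rows):
    """rows: iterable of (person_id, team, score) tuples (hypothetical shape)."""
    # Step 1: aggregate, like sum(s.score) with GROUP BY person_id.
    totals = defaultdict(int)
    team_of = {}
    for person_id, team, score in rows:
        totals[person_id] += score
        team_of[person_id] = team

    # Step 2: the window function sees the aggregated rows, like
    # rank() OVER (PARTITION BY team ORDER BY sum(score) DESC).
    by_team = defaultdict(list)
    for person_id, total in totals.items():
        by_team[team_of[person_id]].append((person_id, total))

    ranked = []
    for team, members in by_team.items():
        for person_id, total in members:
            rnk = 1 + sum(other > total for _, other in members)
            ranked.append((team, rnk, person_id, total))

    # Step 3: the outer filter and sort, like WHERE rnk < 3 ORDER BY team, rnk.
    return sorted(row for row in ranked if row[1] < 3)


# Made-up sample data for illustration only.
sample = [
    (1, "red", 10), (1, "red", 5), (2, "red", 7), (5, "red", 1),
    (3, "blue", 4), (3, "blue", 4), (4, "blue", 9),
]
top = rank_after_aggregate(sample)
```

Because the ranking step only ever sees one row per person, a single query level is enough for both the aggregate and the window function.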
- While talking about "rank", you probably want to use rank(), not row_number().
- Assuming person_id is the PK of person (people.people_id in your original layout), you can simplify GROUP BY to that single column.
- Be sure to table-qualify all column names that might be ambiguous.
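The difference between rank() and row_number() only shows up with ties. A small Python sketch of the two numbering schemes, with hypothetical helper names and made-up scores (SQL leaves the order among tied rows unspecified for row_number(); here it simply follows input order):

```python
def row_number(values):
    """Position in descending order; tied values still get distinct numbers."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    numbers = [0] * len(values)
    for position, index in enumerate(order, start=1):
        numbers[index] = position
    return numbers


def rank(values):
    """1 + count of strictly greater values; tied values share a rank."""
    return [1 + sum(other > v for other in values) for v in values]


team_scores = [20, 20, 20, 5]  # three people tied for first place
kept_by_rank = [s for s, r in zip(team_scores, rank(team_scores)) if r < 3]
kept_by_row_number = [
    s for s, n in zip(team_scores, row_number(team_scores)) if n < 3
]
```

With the filter rnk < 3, rank() keeps all three tied rows, while row_number() keeps two of them more or less arbitrarily.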
Then I would write a PL/pgSQL function that takes parameters for your variable parts, implementing points a to c. Point d is unclear, so I leave that for you to add.
CREATE OR REPLACE FUNCTION f_demo(_agg text DEFAULT 'sum'
                                , _left_join bool DEFAULT FALSE
                                , _where_name text DEFAULT NULL)
  RETURNS TABLE(person_id int, name text, team text, score int, rnk int) AS
$func$
DECLARE
   _agg_op CONSTANT text[] := '{count, sum, avg}';  -- allowed functions
   _sql    text;
BEGIN

-- assert --
IF _agg ILIKE ANY (_agg_op) THEN
   -- all good
ELSE
   RAISE EXCEPTION '_agg must be one of %', _agg_op;
END IF;

-- query --
_sql := format('
SELECT *
FROM  (
   SELECT p.person_id, p.name, p.team, %1$s(s.score)::int AS score
         ,rank() OVER (PARTITION BY p.team
                       ORDER BY %1$s(s.score) DESC)::int AS rnk
   FROM  person p
   %2$s  score s USING (person_id)
   %3$s
   GROUP BY 1
   ) sub
WHERE  rnk < 3
ORDER  BY team, rnk'
   , _agg
   , CASE WHEN _left_join THEN 'LEFT JOIN' ELSE 'JOIN' END
   , CASE WHEN _where_name <> '' THEN 'WHERE p.name LIKE $1' ELSE '' END
);

-- debug   -- quote when tested ok
-- RAISE NOTICE '%', _sql;

-- execute -- unquote when tested ok
RETURN QUERY EXECUTE _sql
USING  _where_name;   -- $1

END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_demo();
SELECT * FROM f_demo('sum', TRUE, '%2');
SELECT * FROM f_demo('avg', FALSE);
SELECT * FROM f_demo(_where_name := '%1_');  -- named param
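From the Python side, the function can be called like any other parameterized query. The helper below is a minimal sketch, not part of the original answer: the name fetch_top_ranked is hypothetical, and it assumes a DB-API cursor whose driver uses the %s placeholder style (as psycopg2 and Django's cursor wrapper do).

```python
def fetch_top_ranked(cursor, agg="sum", left_join=False, where_name=None):
    """Call f_demo() with bound parameters and return all result rows.

    `cursor` is any DB-API cursor using the %s paramstyle (an assumption).
    The values are passed separately from the SQL text, so they are never
    spliced into the query string on the client side.
    """
    cursor.execute(
        "SELECT * FROM f_demo(%s, %s, %s)",
        [agg, left_join, where_name],
    )
    return cursor.fetchall()


# Possible use inside a Django project (not executed here):
#
#   from django.db import connection
#   with connection.cursor() as cursor:
#       rows = fetch_top_ranked(cursor, "sum", True, "%2")
```

Passing None for where_name maps to NULL, which the function treats as "no filter".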
You need a firm understanding of PL/pgSQL. Otherwise, there is just too much to explain. You'll find related answers here on SO under plpgsql for practically every detail in the answer.
All parameters are treated safely, no SQL injection possible.
Note in particular how a WHERE clause is added conditionally (when _where_name is passed) with the positional parameter $1 in the query string. The value is passed to EXECUTE as a value with the USING clause. No type conversion, no escaping, no chance for SQL injection. Examples:
- Row expansion via "*" is not supported here
- SQL state: 42601 syntax error at or near "11"
- Refactor a PL/pgSQL function to return the output of various SELECT queries
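If you prefer to keep the logic in the application, the same two rules (whitelist anything that becomes SQL text, bind anything that is a value) carry over. This is a client-side analogue sketched in Python, not the approach the answer takes; build_rank_query and ALLOWED_AGGREGATES are hypothetical names, and the %s placeholder assumes a driver such as psycopg2 or Django's cursor.

```python
ALLOWED_AGGREGATES = {"count", "sum", "avg"}  # only these may become SQL text

QUERY_TEMPLATE = """
SELECT *
FROM  (
   SELECT p.person_id, p.name, p.team, {agg}(s.score)::int AS score
        , rank() OVER (PARTITION BY p.team
                       ORDER BY {agg}(s.score) DESC)::int AS rnk
   FROM   person p
   {join} score s USING (person_id)
   {where}
   GROUP  BY 1
   ) sub
WHERE  rnk < 3
ORDER  BY team, rnk"""


def build_rank_query(agg="sum", left_join=False, where_name=None):
    """Return (sql, params) for cursor.execute(sql, params)."""
    agg = agg.lower()
    if agg not in ALLOWED_AGGREGATES:
        raise ValueError("agg must be one of %s" % sorted(ALLOWED_AGGREGATES))

    # Fixed fragments are chosen by the code, never taken from user input.
    join = "LEFT JOIN" if left_join else "JOIN"

    # The WHERE clause is added only when a pattern is given; the pattern
    # itself travels as a bound parameter, not as part of the SQL string.
    where, params = "", []
    if where_name:
        where = "WHERE p.name LIKE %s"
        params.append(where_name)

    sql = QUERY_TEMPLATE.format(agg=agg, join=join, where=where)
    return sql, params
```

The trade-off is that the query text now lives in application code, while the function-based version keeps it in the database and callable from any client.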
Use DEFAULT values for function parameters, so you are free to provide any or none.
The function format() is instrumental for building complex dynamic SQL strings in a safe and clean fashion.