函数内部动态查询的 format() 有多安全? [英] How secure is format() for dynamic queries inside a function?

查看:65
本文介绍了函数内部动态查询的 format() 有多安全?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在阅读了 Postgres 手册和这里的许多帖子之后,我写了这个函数来解决我在安全方面的所有发现.效果很好,可以满足我的所有需求.

After reading Postgres manual and many posts here, I wrote this function tacking in mind all I found regarding security. It works great and does everything I looked for.

  • 接受一个json,每个键都有一个数组[visible, filter, arg1, optional arg2]

SELECT public.goods__list_json2('{"name": [true, 7, "Ad%"], "category": [true], "stock": [false, 4, 0]}', 20, 0);

SELECT public.goods__list_json2('{"name": [true, 7, "Ad%"], "category": [true], "stock": [false, 4, 0]}', 20, 0);

  • 返回一个包含请求数据的 json 数组.
  • [{"name": "Adventures of TRON", "category": "Atari 2600"}, {"name": "Adventure", "category": "Atari 2600";}]

    [{"name": "Adventures of TRON", "category": "Atari 2600"}, {"name": "Adventure", "category": "Atari 2600"}]

    我的问题是,我如何才能真正确定当我使用用户输入参数创建查询时,将它们作为 %L 传递给格式是注入安全的?根据我的 db 设计,一切都是通过函数完成的,其中大部分作为安全定义器运行,只允许某些角色执行它们.

    My question is, how could I really be sure that when I create the query using user input arguments, passing them as %L with format is injection safe? By my db design, all is done through functions, running most of them as security definer only allowing certain roles to execute them.

    为了安全起见,我的目的是将旧函数转换为这种动态逻辑,省去自己编写大量代码行来创建新的或特定的查询.

    Being secure, my intention is to convert old functions to this dynamic logic and save myself to write a lot of lines of code creating new or specific queries.

    我真的很感激有经验的 Postgres 开发人员可以就此给我建议.

    I would really appreciate a experienced Postgres developer could give me an advice on this.

    我使用的是 Postgres 13.

    I'm using Postgres 13.

    CREATE FUNCTION public.goods__list_json (IN option__j jsonb, IN limit__i integer, IN offset__i integer)
        RETURNS jsonb
        LANGUAGE plpgsql
        VOLATILE 
        STRICT
        SECURITY DEFINER
        COST 1
        AS $$
    DECLARE
        table__v varchar := 'public.goods_full';
      column__v varchar[] := ARRAY['id', 'id__category', 'category', 'name', 'barcode', 'price', 'stock', 'sale', 'purchase'];
        filter__v varchar[] := ARRAY['<', '>', '<=', '>=', '=', '<>', 'LIKE', 'NOT LIKE', 'ILIKE', 'NOT ILIKE', 'BETWEEN', 'NOT BETWEEN'];
        select__v varchar[];
        where__v varchar[];
      sql__v varchar;
        key__v varchar;
        format__v varchar;
        temp__v varchar;
        temp__i integer;
        betw__v varchar;
        result__j jsonb;
    BEGIN
        FOR key__v IN SELECT jsonb_object_keys(option__j) LOOP
            IF key__v = ANY(column__v) THEN
                IF (option__j->key__v->0)::bool THEN
                    select__v := array_append(select__v, key__v);
                END IF;
                temp__i := (option__j->key__v->1)::int;
                IF temp__i > 0 AND temp__i <= array_length(filter__v, 1) THEN
                    temp__v := (option__j->key__v->>2)::varchar;
                    IF temp__i >= 11 THEN
                        betw__v := (option__j->key__v->>3)::varchar;
                        format__v := format('%I %s %L AND %L', key__v, filter__v[temp__i], temp__v, betw__v);
                    ELSE
                        format__v := format('%I %s %L', key__v, filter__v[temp__i], temp__v);
                    END IF;
                    where__v := array_append(where__v, format__v);
                END IF;
            END IF;
        END LOOP;
        sql__v := 'SELECT jsonb_agg(t) FROM (SELECT '
            || array_to_string(select__v, ', ')
            || format(' FROM %s WHERE ', table__v)
            || array_to_string(where__v, ' AND ')
            || format(' OFFSET %L LIMIT %L', offset__i, limit__i)
            || ') t';
        RAISE NOTICE 'SQL: %', sql__v;
        EXECUTE sql__v INTO result__j;
        RETURN result__j;
    END;
    $$;
    

    推荐答案

    警告:这种风格在SECURITY DEFINER 函数中使用动态 SQL 可以优雅和方便.但不要过度使用它.不要以这种方式嵌套多层函数:

    A word of warning: this style with dynamic SQL in SECURITY DEFINER functions can be elegant and convenient. But don't overuse it. Do not nest multiple levels of functions this way:

    • 该样式比普通 SQL 更容易出错.
    • 带有 SECURITY DEFINER 的上下文切换有一个价格标签.
    • 使用 EXECUTE 的动态 SQL 无法保存和重用查询计划.
    • 没有函数内联".
    • 而且我根本不想将它用于大表上的大查询.增加的复杂性可能是性能障碍.例如:以这种方式禁用查询计划的并行性.
    • The style is much more error prone than plain SQL.
    • The context switch with SECURITY DEFINER has a price tag.
    • Dynamic SQL with EXECUTE cannot save and reuse query plans.
    • No "function inlining".
    • And I'd rather not use it for big queries on big tables at all. The added sophistication can be a performance barrier. Like: parallelism is disabled for query plans this way.

    也就是说,你的函数看起来不错,我看不到 SQL 注入.format() 被证明可以很好地连接并引用动态 SQL 的值和标识符.相反,您可能会删除一些冗余以降低成本.

    That said, your function looks good, I see no way for SQL injection. format() is proven good to concatenate and quote values and identifiers for dynamic SQL. On the contrary, you might remove some redundancy to make it cheaper.

    函数参数offset__ilimit__iinteger.SQL 注入是不可能通过整数进行的,实际上不需要引用它们(即使 SQL 允许 LIMITOFFSET 的引用字符串常量).所以只是:

    Function parameters offset__i and limit__i are integer. SQL injection is impossible through integer numbers, there is really no need to quote them (even though SQL allows quoted string constants for LIMIT and OFFSET). So just:

    format(' OFFSET %s LIMIT %s', offset__i, limit__i)
    

    此外,在验证每个 key__v 都在您的合法列名中之后 - 虽然这些都是合法的,未加引号的列名 - 没有必要通过 %I.可以只是 %s

    Also, after verifying that each key__v is among your legal column names - and while those are all legal, unquoted column names - there is no need to run it through %I. Can just be %s

    我宁愿使用 text 而不是 varchar.没什么大不了的,但 text 是首选"字符串类型.

    I'd rather use text instead of varchar. Not a big deal, but text is the "preferred" string type.

    相关:

    COST 1 似乎太低了.手册:

    COST execution_cost

    给出估计执行成本的正数函数,以 为单位cpu_operator_cost.如果函数返回一个集合,这是每个返回行的成本.如果费用不指定,假设 1 个单位用于 C 语言和内部函数,和 100 个单位用于所有其他语言的功能.较大的值导致计划者试图避免更频繁地评估函数不必要的.

    A positive number giving the estimated execution cost for the function, in units of cpu_operator_cost. If the function returns a set, this is the cost per returned row. If the cost is not specified, 1 unit is assumed for C-language and internal functions, and 100 units for functions in all other languages. Larger values cause the planner to try to avoid evaluating the function more often than necessary.

    除非您更清楚,否则将 COST 保留为默认的 100.

    Unless you know better, leave COST at its default 100.

    整个循环可以用一个 SELECT 语句代替.应该明显更快.PL/pgSQL 中的赋值相对昂贵.像这样:

    The whole looping can be replaced with a single SELECT statement. Should be noticeably faster. Assignments are comparatively expensive in PL/pgSQL. Like this:

    CREATE OR REPLACE FUNCTION goods__list_json (_options json, _limit int = NULL, _offset int = NULL, OUT _result jsonb)
        RETURNS jsonb
        LANGUAGE plpgsql SECURITY DEFINER AS
    $func$
    DECLARE
       _tbl  CONSTANT text   := 'public.goods_full';
       _cols CONSTANT text[] := '{id, id__category, category, name, barcode, price, stock, sale, purchase}';   
       _oper CONSTANT text[] := '{<, >, <=, >=, =, <>, LIKE, "NOT LIKE", ILIKE, "NOT ILIKE", BETWEEN, "NOT BETWEEN"}';
       _sql           text;
    BEGIN
       SELECT concat('SELECT jsonb_agg(t) FROM ('
               , 'SELECT ' || string_agg(t.col, ', '  ORDER BY ord) FILTER (WHERE t.arr->>0 = 'true')
                                                   -- ORDER BY to preserve order of objects in input
               , ' FROM '  || _tbl
               , ' WHERE ' || string_agg (
                                 CASE WHEN (t.arr->>1)::int BETWEEN  1 AND 10 THEN
                                    format('%s %s %L'       , t.col, _oper[(arr->>1)::int], t.arr->>2)
                                      WHEN (t.arr->>1)::int BETWEEN 11 AND 12 THEN
                                    format('%s %s %L AND %L', t.col, _oper[(arr->>1)::int], t.arr->>2, t.arr->>3)
                                   -- ELSE NULL  -- = default - or raise exception for illegal operator index?
                                 END
                               , ' AND '  ORDER BY ord) -- ORDER BY only cosmetic
               , ' OFFSET ' || _offset  -- SQLi-safe, no quotes required
               , ' LIMIT '  || _limit   -- SQLi-safe, no quotes required
               , ') t'
              )
       FROM   json_each(_options) WITH ORDINALITY t(col, arr, ord)
       WHERE  t.col = ANY(_cols)        -- only allowed column names - or raise exception for illegal column?
       INTO   _sql;
    
       IF _sql IS NULL THEN
          RAISE EXCEPTION 'Invalid input resulted in empty SQL string! Input: %', _options;
       END IF;
       
       RAISE NOTICE 'SQL: %', _sql;
       EXECUTE _sql INTO _result;
    END
    $func$;
    

    db<>fiddle 这里

    更短、更快并且仍然安全地对抗 SQLi.

    Shorter, faster and still safe against SQLi.

    仅在语法或防止 SQL 注入需要时才添加引号.烧毁仅过滤值.列名和运算符根据允许选项的硬连线列表进行验证.

    Quotes are only added where necessary for syntax or to defend against SQL injection. Burns down to filter values only. Column names and operators are verified against the hard-wired list of allowed options.

    输入是 json 而不是 jsonb.json 中保留了对象的顺序,因此您可以确定 SELECT 列表中的列顺序(这是有意义的)和 WHERE 条件(这纯粹是化妆品).该函数现在观察到两者.

    Input is json instead of jsonb. Order of objects is preserved in json, so you can determine the sequence of columns in the SELECT list (which is meaningful) and WHERE conditions (which is purely cosmetic). The function observes both now.

    输出 _result 仍然是 jsonb.使用 OUT 参数代替变量.这完全是可选的,只是为了方便.(不需要明确的 RETURN 语句.)

    Output _result is still jsonb. Using an OUT parameter instead of the variable. That's totally optional, just for convenience. (No explicit RETURN statement required.)

    注意 concat() 的策略性使用以静默忽略 NULL 和连接运算符 || 以便 NULL 使连接的字符串为 NULL.这样,FROMWHERELIMITOFFSET 只在需要的地方插入.SELECT 语句在没有任何一个的情况下也能工作.空的 SELECT 列表(也是合法的,但我认为不需要)会导致语法错误.都是有意的.
    仅将 format() 用于 WHERE 过滤器,以方便并引用值.见:

    Note the strategic use of concat() to silently ignore NULL and the concatenation operator || so that NULL makes the concatenated string NULL. This way, FROM, WHERE, LIMIT, and OFFSET are only inserted where needed. A SELECT statement works without either of those. An empty SELECT list (also legal, but I suppose unwanted) results in a syntax error. All intended.
    Using format() only for WHERE filters, for convenience and to quote values. See:

    函数不再是 STRICT._limit_offset 有默认值NULL,所以只需要第一个参数_options._limit_offset 可以为 NULL 或省略,然后每个都从语句中剥离.

    The function isn't STRICT anymore. _limit and _offset have default value NULL, so only the first parameter _options is required. _limit and _offset can be NULL or omitted, then each is stripped from the statement.

    使用 text 而不是 varchar.

    实际上创建了常量变量CONSTANT(主要用于文档).

    Made constant variables actually CONSTANT (mostly for documentation).

    除此之外,该函数执行您原来的功能.

    Other than that the function does what your original does.

    这篇关于函数内部动态查询的 format() 有多安全?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆