更新多行时强制转换为NULL类型 [英] Casting NULL type when updating multiple rows

查看:95
本文介绍了更新多行时强制转换为NULL类型的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当我尝试同时更新许多行时遇到问题。



这是我使用的表和查询(为便于阅读而简化):



 创建表foo 

个整数,
x整数,
y integer

查询

  UPDATE foo SET x = tx,y = ty FROM 
(值(50,50,1),
(100 ,120,2))
AS t(x,y,pkid)WHERE foo.pkid = t.pkid

此查询工作完美,但是当我尝试执行所有 x y 值均为空的查询时,出现错误:



具有空值的查询

  UPDATE foo SET x = tx,y = ty从
(值(null,20,1),
(null,50,2))
AS t(x,y,pkid)在foo.pkid处= t.pkid

错误

 错误:列 x的类型为整数,但表达式的类型为typ e text 
第1行:UPDATE foo SET x = tx FROM

唯一的解决方法即将值(null,20,1)中的至少一个更改为(null:int,50,2)但我不能这样做,因为我有一个生成这些更新多行查询的函数,它对列类型一无所知。



这里最好的解决方案是什么?有多行更好的更新查询吗?是否有任何函数或语法,例如 AS t(x:gettype(foo.x),y:gettype(foo.y),pkid:gettype(foo.pkid))

解决方案

带有独立的 VALUES 表达式PostgreSQL不知道数据类型应该是。使用简单的数字文字,系统很乐于假定匹配类型。但是使用其他输入(如 NULL ),您将需要显式转换-正如您已经发现的那样。



您可以查询 pg_catalog (快速,但特定于PostgreSQL)或 information_schema (速度慢,但标准SQL)来查找



或者您可以使用这些简单的技巧之一(我为 last 保存了最好的东西) :



1。选择具有 LIMIT 0 的行,附加具有 UNION ALL

的行

  UPDATE foo f 
SET x = tx
,y = ty
FROM(
(SELECT pkid,x,y FROM foo LIMIT 0)-带有LIMIT
UNION ALL
SELECT 1,20,NULL的圆括号
UNION ALL
SELECT 2,50,NULL
)t-列名和类型已经定义
$ f $$$ f $$$$$$$$$$$$$$$$选择子查询:

 (SELECT x,y,pkid FROM foo LIMIT 0)

获取列的名称和类型,但 LIMIT 0 阻止它添加实际的行。随后的行被强制为现在定义良好的行类型-并立即检查它们是否与该类型匹配。



主要限制:使用单独的 SELECT 行,Postgres立即将输入文字转换为尽力而为类型。稍后当它尝试将类型转换为第一个 SELECT 的给定类型时,如果某些类型在假定类型和指定类型之间没有注册的分配,可能已经为时已晚。目标类型。示例文本-> 时间戳



专业版:

-最小开销。

-几行可读,简单,快速。

-您只需要知道相关的列名



Con:

-某些类型的类型解析可能会失败。

- UNION ALL SELECT VALUES 表达式要慢,因为您在测试中找到的内容。

-每行的详细语法。



2。每个列类型的 VALUES 表达式



  ... 
FROM(

((从foo LIMIT 0选择pkid)
,(从foo LIMIT 0选择x)
,(从foo LIMIT 0选择y)-分别获取每个col的类型
,(1,20,NULL)
,(2,50,NULL)
)t(pkid,x,y)-尚未定义的列名,仅类型。
...

VALUES中的第一行表达式是一行 NULL 值,该值定义所有后续行的类型。



Pro:

-比 1快。

-具有多​​列且只有很少几列的表的最短语法。

-您只需要知道表的相关列名。



Con:

-仅少量行的详细语法

-可读性较低(IMO)。



3。 VALUES 行类型的表达式



  UPDATE foo f 
SET x =(tr).x-使语法清晰的
,y =(tr).y
FROM(
VALUES
('(1,20,) ':: foo)-列必须按表
的默认顺序执行,('((2,50,)')-NULL
)t(r)的最后一个逗号之后没有- -行类型
的列名称WHERE f.pkid =(tr).pkid

您显然知道表名。如果您还知道列数及其顺序,则可以使用它。



对于PostgreSQL中的每个表,都会自动注册一个行类型。如果匹配表达式中的列数,则可以将其强制转换为表的行类型('(1,50,)':: foo ),从而分配列类型隐式。不要在逗号后面输入 NULL 值。为每个不相关的尾随列添加一个逗号。

在下一步中,您可以使用演示的语法访问各个列。有关手册中 字段选择的更多信息



或者您可以 添加 一行空值,并对实际数据使用统一的语法:

  ... 

((NULL :: foo))-NULL行values
,('((1,20,)')-所有
,('((2,50,)')
...
的统一ROW值语法

添加的行被排除在 WHERE 子句中code> UPDATE

出于其他目的,您可以在子查询中使用 OFFSET 1 消除添加的第一行。 。



Pro:

-最快(至少在我的测试中只有很少的行和列)。

-需要所有列的少数行或表的最短语法。
-您不必拼写表的列-所有列都会自动具有匹配的名称。



Con:

-用于从记录/行/复合类型中选择字段的语法不太为人所知。

-您需要按默认顺序知道相关列的数量和位置。



4。具有分解的行类型的 VALUES 表达式



类似于 3。 ,但具有标准语法中可分解的行:

  UPDATE foo f 
SET x = tx
,y = ty
FROM(
VALUES
(('((1,20,)':: foo)。*)-分解的行
,(2,50,NULL)
)t(pkid,x,y)-任意列名(我使它们匹配)
W.E.pkid = t.pkid;

或者,再次使用前导的NULL值:

  ... 

((NULL :: foo)。*)-空值行
,(1, 20,NULL)-所有
的统一语法,(2,50,NULL)
...

优点和缺点,类似于 3。 。但是语法更广为人知。

然后,您需要拼出列名(如果需要)。



5。 VALUES 表达式,其类型是从行类型中获取的



类似于 Unril评论,我们可以结合 2。 strong>和 4。 ,以仅提供列的子集:

 更新foo f 
SET(x,y)
=(tx,ty)-简写,请参见下面的
FROM(

((NULL: :foo).pkid,(NULL :: foo).x,(NULL :: foo).y)-列子集
,(1,20,NULL)
,(2,50 ,NULL)
)t(pkid,x,y)-任意列名(我使它们匹配)
W.E.ff.pkid = t.pkid;

优缺点,如 4。 em> ,但我们可以使用任何列子集,而不必知道完整的列表。



还显示 UPDATE 本身,对于包含许多列的情况很方便。相关:





4。和5.是我的最爱。


I have a problem when I try to update many rows at the same time.

Here is the table and query I use (simplified for better reading):

table

CREATE TABLE foo
(
    pkid integer,
    x integer,
    y integer
)

query

UPDATE foo SET x=t.x, y=t.y FROM
(VALUES (50, 50, 1),
        (100, 120, 2))
AS t(x, y, pkid) WHERE foo.pkid=t.pkid

This query works perfectly, but when I try to execute a query where all x or y values are null, I get an error:

query with nulls

UPDATE foo SET x=t.x, y=t.y FROM
(VALUES (null, 20, 1),
        (null, 50, 2))
AS t(x, y, pkid) WHERE foo.pkid=t.pkid

error

ERROR:  column "x" is of type integer but expression is of type text
LINE 1: UPDATE foo SET x=t.x FROM

The only way to fix that is to change at least one of the values (null, 20, 1) to (null:int, 50, 2) but I can't do that, since I have a function which generates these "update multiple rows" query and it doesn't know anything about the column types.

What's the best solution here? Is there any better update query for multiple rows? Is there any function or syntax like AS t(x:gettype(foo.x), y:gettype(foo.y), pkid:gettype(foo.pkid))?

解决方案

With a standalone VALUES expression PostgreSQL has no idea what the data types should be. With simple numeric literals the system is happy to assume matching types. But with other input (like NULL) you would need to cast explicitly - as you already have found out.

You can query pg_catalog (fast, but PostgreSQL-specific) or the information_schema (slow, but standard SQL) to find out and prepare your statement with appropriate types.

Or you can use one of these simple "tricks" (I saved the best for last):

1. Select row with LIMIT 0, append rows with UNION ALL

UPDATE foo f
SET    x = t.x
     , y = t.y
FROM  (
  (SELECT pkid, x, y FROM foo LIMIT 0) -- parenthesis needed with LIMIT
   UNION ALL
   SELECT 1, 20, NULL
   UNION ALL
   SELECT 2, 50, NULL
   ) t               -- column names and types are already defined
WHERE  f.pkid = t.pkid

The first sub-select of the subquery:

(SELECT x, y, pkid  FROM foo LIMIT 0)

gets names and types for the columns, but LIMIT 0 prevents it from adding an actual row. Subsequent rows are coerced to the now well-defined row type - and checked immediately if they match the type. Should be a subtle additional improvement over your original form.

Major limitation: with separate SELECT lines, Postgres casts the input literals to a "best-effort" type immediately. When it later tries to cast to the given types of the first SELECT, it may already be too late for some types if there is no registered assignment cast between the assumed type and the target type. Example text -> timestamp.

Pro:
- Minimum overhead.
- Readable, simple and fast for few rows.
- You only need to know relevant column names of the table.

Con:
- Type resolution can fail for some types.
- UNION ALL SELECT is slower than VALUES expression for long lists of rows, as you found in your test.
- Verbose syntax per row.

2. VALUES expression with per-column type

...
FROM  (
   VALUES 
     ((SELECT pkid FROM foo LIMIT 0)
    , (SELECT x    FROM foo LIMIT 0)
    , (SELECT y    FROM foo LIMIT 0))  -- get type for each col individually
   , (1, 20, NULL)
   , (2, 50, NULL)
   ) t (pkid, x, y)  -- columns names not defined yet, only types.
...

The first row in the VALUES expression is a row of NULL values which defines the type for all subsequent rows.

Pro:
- Faster than 1.
- Shortest syntax for tables with many columns and only few are relevant.
- You only need to know relevant column names of the table.

Con:
- Verbose syntax for only few rows
- Less readable (IMO).

3. VALUES expression with row type

UPDATE foo f
SET x = (t.r).x       -- parenthesis needed to make syntax unambiguous
  , y = (t.r).y
FROM (
   VALUES
      ('(1,20,)'::foo)  -- columns need to be in default order of table
     ,('(2,50,)')       -- nothing after the last comma for NULL
   ) t (r)              -- column name for row type
WHERE  f.pkid = (t.r).pkid

You obviously know the table name. If you also know the number of columns and their order you can work with this.

For every table in PostgreSQL a row type is registered automatically. If you match the number of columns in your expression, you can cast to the row type of the table ('(1,50,)'::foo) thereby assigning column types implicitly. Put nothing behind a comma to enter a NULL value. Add a comma for every irrelevant trailing column.
In the next step you can access individual columns with the demonstrated syntax. More about Field Selection in the manual.

Or you could add a row of NULL values and use uniform syntax for actual data:

...
  VALUES
      ((NULL::foo))  -- row of NULL values
    , ('(1,20,)')    -- uniform ROW value syntax for all
    , ('(2,50,)')
...

The added row is excluded by the WHERE clause in your UPDATE.
For other purposes you can eliminate the added first row with OFFSET 1 in a subquery.

Pro:
- Fastest (at least in my tests with few rows and columns).
- Shortest syntax for few rows or tables where you need all columns. - You don't have to spell out columns of the table - all columns automatically have the matching name.

Con:
- Not so well known syntax for field selection from record / row / composite type.
- You need to know number and position of relevant columns in default order.

4. VALUES expression with decomposed row type

Like 3., but with decomposed rows in standard syntax:

UPDATE foo f
SET    x = t.x
     , y = t.y
FROM (
   VALUES
      (('(1,20,)'::foo).*)  -- decomposed row of values
    , (2, 50, NULL)
   ) t(pkid, x, y)  -- arbitrary column names (I made them match)
WHERE  f.pkid = t.pkid;

Or, with a leading row of NULL values again:

...
   VALUES
      ((NULL::foo).*)  -- row of NULL values
    , (1, 20, NULL)      -- uniform syntax for all
    , (2, 50, NULL)
...

Pros and cons like 3., but with more commonly known syntax.
And you need to spell out column names (if you need them).

5. VALUES expression with types fetched from row type

Like Unril commented, we can combine the virtues of 2. and 4. to provide only a subset of columns:

UPDATE foo f
SET   (  x,   y)
    = (t.x, t.y)  -- short notation, see below
FROM (
   VALUES
      ((NULL::foo).pkid, (NULL::foo).x, (NULL::foo).y)  -- subset of columns
    , (1, 20, NULL)
    , (2, 50, NULL)
   ) t(pkid, x, y)       -- arbitrary column names (I made them match)
WHERE  f.pkid = t.pkid;

Pros and cons like 4., but we can work with any subset of columns and don't have to know the full list.

Also displaying short syntax for the UPDATE itself that's convenient for cases with many columns. Related:

4. and 5. are my favorites.

这篇关于更新多行时强制转换为NULL类型的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆