Set empty strings ('') to NULL in the whole database
Problem description
In my database there are many text columns whose values are empty strings (''). The empty strings need to be set to NULL. I do not know the exact schemas, tables and columns in this database, or rather, I want to write a general solution that can be reused.
How would I write a query or function to find all text columns in all tables in all schemas and update every column holding empty strings ('') to NULL?
Recommended answer
The most efficient approach:
- Run a single UPDATE per table.
- Only update nullable columns (not defined NOT NULL) with any actual empty string.
- Only update rows with any actual empty string.
- Leave other values unchanged.
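To make the four points above concrete, here is a minimal sketch of the kind of statement they lead to. The table public.customer and its nullable text columns name and note are hypothetical and exist only for illustration.

```sql
-- Hypothetical table: public.customer with nullable text columns "name" and "note".
-- One UPDATE per table; NULLIF() turns '' into NULL and leaves other values unchanged.
UPDATE public.customer
SET    name = NULLIF(name, '')
     , note = NULLIF(note, '')
WHERE  name = ''          -- only touch rows that actually
   OR  note = '';         -- contain at least one empty string
```

The WHERE clause matters for performance: without it, every row in the table would be rewritten even when nothing changes.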
This related answer has a plpgsql function that builds and runs the UPDATE command automatically and safely for any given table, using the system catalog pg_attribute.
Using the function f_empty2null() from that answer, you can loop through selected tables like this:
DO
$do$
DECLARE
   _tbl regclass;
BEGIN
   FOR _tbl IN
      SELECT c.oid::regclass
      FROM   pg_class     c
      JOIN   pg_namespace n ON n.oid = c.relnamespace
      WHERE  c.relkind = 'r'                      -- only regular tables
      AND    n.nspname NOT LIKE 'pg\_%'           -- exclude system schemas ("_" escaped: it is a LIKE wildcard)
      AND    n.nspname <> 'information_schema'    -- also holds a few regular tables
   LOOP
      RAISE NOTICE $$PERFORM f_empty2null('%');$$, _tbl;
      -- PERFORM f_empty2null(_tbl);  -- uncomment to prime the bomb
   END LOOP;
END
$do$;
Careful! This updates all empty strings in all columns of all user tables in the DB. Be sure that's what you want or it might nuke your database.
You need UPDATE privileges on all selected tables, of course.
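One way to check this up front, as a rough sketch: list the user tables on which the current role lacks the UPDATE privilege, using the built-in has_table_privilege() function. The schema filter mirrors the loop above and is an assumption about which tables you intend to touch.

```sql
-- Tables the current role could NOT update; an empty result means the loop can proceed.
SELECT c.oid::regclass AS tbl
FROM   pg_class     c
JOIN   pg_namespace n ON n.oid = c.relnamespace
WHERE  c.relkind = 'r'                           -- only regular tables
AND    n.nspname NOT LIKE 'pg\_%'                -- exclude system schemas
AND    n.nspname <> 'information_schema'
AND    NOT has_table_privilege(c.oid, 'UPDATE')  -- privilege check for the current user
ORDER  BY 1;
```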
As a child safety device I commented out the payload.
You may have noted that I use the system catalogs directly, not the information schema (which would work, too). About this:
- How to check if a table exists in a given schema
- Query to return output column names and data types of a query, table or view
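For comparison, a sketch of how the same column selection might look against the information schema. This is not taken from the linked answers; it assumes you only care about the text, varchar and char(n) types, which information_schema.columns reports as 'text', 'character varying' and 'character'.

```sql
-- Nullable character columns of ordinary user tables, via the information schema.
SELECT c.table_schema, c.table_name, c.column_name
FROM   information_schema.columns c
JOIN   information_schema.tables  t USING (table_catalog, table_schema, table_name)
WHERE  t.table_type = 'BASE TABLE'               -- regular tables, no views
AND    c.is_nullable = 'YES'                     -- skip columns defined NOT NULL
AND    c.data_type IN ('text', 'character varying', 'character')
AND    c.table_schema NOT LIKE 'pg\_%'           -- exclude system schemas
AND    c.table_schema <> 'information_schema'
ORDER  BY c.table_schema, c.table_name, c.ordinal_position;
```

Keep in mind that the information schema only shows objects the current role has access to, and its views are typically slower than querying the catalogs directly.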
Here is an integrated solution for repeated use. Without safety devices:
CREATE OR REPLACE FUNCTION f_all_empty2null(OUT _tables int, OUT _rows int) AS
$func$
DECLARE
   _typ CONSTANT regtype[] := '{text, bpchar, varchar, "char"}';
   _sql text;
   _row_ct int;
BEGIN
   _tables := 0;  _rows := 0;
   FOR _sql IN
      -- build one UPDATE statement per table
      SELECT format('UPDATE %s SET %s WHERE %s'
                  , t.tbl
                  , string_agg(format($$%1$s = NULLIF(%1$s, '')$$, t.col), ', ')
                  , string_agg(t.col || $$ = ''$$, ' OR '))
      FROM  (
         SELECT c.oid::regclass AS tbl, quote_ident(attname) AS col
         FROM   pg_namespace n
         JOIN   pg_class     c ON c.relnamespace = n.oid
         JOIN   pg_attribute a ON a.attrelid = c.oid
         WHERE  n.nspname NOT LIKE 'pg\_%'         -- exclude system schemas ("_" escaped: it is a LIKE wildcard)
         AND    n.nspname <> 'information_schema'  -- also holds a few regular tables
         AND    c.relkind = 'r'                    -- only regular tables
         AND    a.attnum >= 1                      -- exclude tableoid & friends
         AND    NOT a.attisdropped                 -- exclude dropped columns
         AND    NOT a.attnotnull                   -- exclude columns defined NOT NULL!
         AND    a.atttypid = ANY(_typ)             -- only character types
         ORDER  BY a.attnum
         ) t
      GROUP  BY t.tbl
   LOOP
      EXECUTE _sql;
      GET DIAGNOSTICS _row_ct = ROW_COUNT;  -- report nr. of affected rows
      _tables := _tables + 1;
      _rows   := _rows + _row_ct;
   END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_all_empty2null();
Returns:
_tables | _rows
---------+---------
23 | 123456
Note how I escaped both table and column names properly!
c.oid::regclass AS tbl, quote_ident(attname) AS col
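A small illustration of why this matters when identifiers are concatenated into dynamic SQL. The column names below are made up; quote_ident() only adds double quotes where the identifier would otherwise be misread.

```sql
-- quote_ident() quotes only when necessary.
SELECT quote_ident('plain_col') AS unchanged   -- plain_col
     , quote_ident('MyCol')     AS mixed_case  -- "MyCol"  (would otherwise fold to mycol)
     , quote_ident('order')     AS reserved;   -- "order"  (reserved word)
```

The cast to regclass does the equivalent for table names, and additionally schema-qualifies the name when the table is not visible in the current search_path.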
Careful! Same warning as above.
Also consider the basic explanation in the answer I linked above.
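Given that warning, one cautious way to try the function is inside a transaction that you roll back. This is only a sketch, and it assumes f_all_empty2null() has been created exactly as shown above.

```sql
BEGIN;

-- Run the conversion and look at the reported counts.
SELECT * FROM f_all_empty2null();

-- Inspect a few affected tables here while the changes are still uncommitted.

ROLLBACK;  -- undo everything; replace with COMMIT once you are sure
```

Note that the UPDATE statements hold row locks on the affected rows until the transaction ends, so keep the inspection short on a busy database.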