PostgreSQL 各种清理字符串\varchar [英] PostgreSQL various clean up of string \ varchar
问题描述
我必须以下列方式清理一些 varchar:
I have to clean up some varchar in the following manner:
去除特殊字符,如:
!、@、#、$、%、^、&、*、(、)、}、{、[、]、、"、.、?,/, ',
来自封闭列表.我已经通过大量使用 replace\regexp_replace 设法做到了这一点,但我正在寻找类似于 SQL Server 中的东西.
Remove special characters such as:
!, @, #, $, %, ^, &, *, (, ), }, {, [, ], ",", ., ?, /, ',
from a closed list. I've managed to do so with a mass use of replace\regexp_replace but I'm looking for something similar to the one in SQL server.
删除以下数字但不删除相邻的数字含义:
Remove following numbers but not adjacent ones meaning:
round 1 -->回合
round1 -->第一轮
round 12345 -->回合
round12345 -->round12345
从封闭的单词列表中删除单词,例如:and"、or"、Ltd."、international",不带子字符串:
Remove words out of a closed list of words such as: "and", "or", "Ltd.", "international" without substrings:
更多的食物或饮料--->更多的食物饮料.而不是-->我的食物或饮料
我正在处理一张大桌子,我希望尽可能高效.
我应该写一个函数来做到这一点还是有更优雅的方法?
I'm working on a big table and I'd like to do it as efficient as possible.
Should I write a function that does that or is there a more elegant way?
推荐答案
我会一起玩.这是问题 2:
SELECT trim(regexp_replace(
'12 foo1 run 1457 ABC 1Foo 2456 bar 34',
'\s*\m\d+\M\s*',
' ',
'g'
));
返回:
foo1 run ABC 1Foo bar
我更新了答案以使用约束表达式而不是括号表达式,因为手册通知:
I updated the answer to use constraint expressions instead of the bracket expressions, because the manual informs:
下面描述的约束转义通常是可取的;他们不再标准,但更容易打字.
The constraint escapes described below are usually preferable; they are no more standard, but are easier to type.
\s*
.. 零个或多个空格\m
.. 一个词的开头(同[[:<:]]
)\d+
.. 1 位或更多位\M
.. 词尾(同[[:>:]]
)
\s*
.. zero or more white space
\m
.. beginning of a word (same as [[:<:]]
)
\d+
.. 1 or more digits
\M
.. end of a word (same as [[:>:]]
)
需要第四个参数 'g'
来替换全局",而不仅仅是第一次出现.
The 4th parameter 'g'
is needed to replace "globally", not just the first occurrence.
->sqlfiddle for v9.2
->sqlfiddle v8.4 不起作用
->sqlfiddle for v9.2
->sqlfiddle for v8.4 doesn't work
standard_conforming_strings
.默认值随 v9.1 更改.
standard_conforming_strings
. The default changed with v9.1.
这个适用于两个世界,可以说是兼容模式".但是上面现代版本的语法(结合 standard_conforming_strings = on
)更简洁.
This works in both worlds, "compatibility mode" so to say. But the syntax of the modern version above (in combination with standard_conforming_strings = on
) is cleaner.
SELECT trim(regexp_replace(
'12 foo1 run 1457 ABC 1Foo 2456 bar 34',
E'\\s*\\m\\d+\\M\\s*',
' ',
'g'
));
这篇关于PostgreSQL 各种清理字符串\varchar的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!