PostgreSQL 各种清理字符串\varchar [英] PostgreSQL various clean up of string \ varchar

查看:347
本文介绍了PostgreSQL 各种清理字符串\varchar的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我必须以下列方式清理一些 varchar:

I have to clean up some varchar in the following manner:

  1. 去除特殊字符,如:!、@、#、$、%、^、&、*、(、)、}、{、[、]、、"、.、?,/, ', 来自封闭列表.我已经通过大量使用 replace\regexp_replace 设法做到了这一点,但我正在寻找类似于 SQL Server 中的东西.

  1. Remove special characters such as: !, @, #, $, %, ^, &, *, (, ), }, {, [, ], ",", ., ?, /, ', from a closed list. I've managed to do so with a mass use of replace\regexp_replace but I'm looking for something similar to the one in SQL server.

删除以下数字但不删除相邻的数字含义:

Remove following numbers but not adjacent ones meaning:

round 1 -->回合
round1 -->第一轮
round 12345 -->回合
round12345 -->round12345

从封闭的单词列表中删除单词,例如:and"、or"、Ltd."、international",不带子字符串:

Remove words out of a closed list of words such as: "and", "or", "Ltd.", "international" without substrings:

更多的食物或饮料--->更多的食物饮料.而不是-->我的食物或饮料

我正在处理一张大桌子,我希望尽可能高效.
我应该写一个函数来做到这一点还是有更优雅的方法?

I'm working on a big table and I'd like to do it as efficient as possible.
Should I write a function that does that or is there a more elegant way?

推荐答案

我会一起玩.这是问题 2:

SELECT trim(regexp_replace(
   '12 foo1 run 1457 ABC 1Foo 2456 bar 34',
   '\s*\m\d+\M\s*',
   ' ',
   'g'
   ));

返回:

foo1 run ABC 1Foo bar

我更新了答案以使用约束表达式而不是括号表达式,因为手册通知:

I updated the answer to use constraint expressions instead of the bracket expressions, because the manual informs:

下面描述的约束转义通常是可取的;他们不再标准,但更容易打字.

The constraint escapes described below are usually preferable; they are no more standard, but are easier to type.

\s* .. 零个或多个空格
\m .. 一个词的开头(同[[:<:]])
\d+ .. 1 位或更多位
\M .. 词尾(同[[:>:]])

\s* .. zero or more white space
\m .. beginning of a word (same as [[:<:]])
\d+ .. 1 or more digits
\M .. end of a word (same as [[:>:]])

需要第四个参数 'g' 来替换全局",而不仅仅是第一次出现.

The 4th parameter 'g' is needed to replace "globally", not just the first occurrence.

->sqlfiddle for v9.2
->sqlfiddle v8.4 不起作用

->sqlfiddle for v9.2
->sqlfiddle for v8.4 doesn't work

standard_conforming_strings.默认值随 v9.1 更改.

standard_conforming_strings. The default changed with v9.1.

这个适用于两个世界,可以说是兼容模式".但是上面现代版本的语法(结合 standard_conforming_strings = on)更简洁.

This works in both worlds, "compatibility mode" so to say. But the syntax of the modern version above (in combination with standard_conforming_strings = on) is cleaner.

SELECT trim(regexp_replace(
  '12 foo1 run 1457 ABC 1Foo 2456 bar 34',
  E'\\s*\\m\\d+\\M\\s*',
  ' ',
  'g'
));

->sqlfiddle

这篇关于PostgreSQL 各种清理字符串\varchar的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆