Replace characters with multi-character strings


Question

I am trying to replace German and Dutch special characters such as ä, ü, or ß. They should be written as ae instead of ä, so I can't simply translate one character into another.

Is there a more elegant way to do that? Currently it looks like this (not complete yet):

SELECT addr, REPLACE(REPLACE(addr, 'ü', 'ue'), 'ß', 'ss') FROM search;

While trying different commands I ran into another problem:

When I searched for Ü, I got this:

ERROR: invalid byte sequence for encoding "UTF8": 0xdc27

I tried it with U&'\0220', but it didn't replace anything. Only the lowercase ü was replaced correctly. It must have something to do with Unicode, but how do I solve this issue?

Kind regards from Germany. :)

Answer

Your server encoding seems to be UTF8.
I suspect your client_encoding does not match, which might give you a wrong impression of what you are dealing with. Check with:

SHOW client_encoding;   -- in your actual session
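To compare both sides of the connection, something like the following can be run in the affected session. This is only a sketch: the statements are standard PostgreSQL, but whether UTF8 is the right value to set is an assumption that depends on what your terminal or client program actually sends.

```sql
-- Encoding the database stores text in
SHOW server_encoding;

-- Encoding the server assumes for the bytes this session sends
SHOW client_encoding;

-- Assumption: the client really sends UTF-8, so declare that for this session.
-- If the client sends Latin-1 instead, 'LATIN1' would be the matching value.
SET client_encoding = 'UTF8';
```

The reported error is consistent with such a mismatch: 0xdc is Ü in LATIN1 and 0x27 is the single quote closing the string literal, and that byte pair is not valid UTF-8.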

And read these related answers:
Can not insert German characters in Postgres
Replace unicode characters in PostgreSQL

The rest of the tool chain has to be in sync, too. When using PuTTY, for instance, you have to make sure the terminal agrees with the rest: Change settings... Window -> Translation -> Remote character set = UTF-8.

As for your first question, you already have the best solution. A handful of umlauts are best replaced with a chain of nested replace() calls.
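A fuller version of the nested replace() chain might look like the sketch below. The table search and the column addr come from the question; the alias addr_ascii and the choice to map capitals to 'Ae', 'Oe', 'Ue' are assumptions made for illustration.

```sql
-- One replace() call per character; the innermost call runs first.
SELECT addr,
       replace(replace(replace(replace(replace(replace(replace(addr,
           'ä', 'ae'),
           'ö', 'oe'),
           'ü', 'ue'),
           'Ä', 'Ae'),   -- assumption: capitals become 'Ae', not 'AE'
           'Ö', 'Oe'),
           'Ü', 'Ue'),
           'ß', 'ss') AS addr_ascii   -- hypothetical column alias
FROM   search;
```

This only behaves as expected once client_encoding matches what the client sends. If you prefer a Unicode escape over typing the character, note that Ü is code point U+00DC, so the literal would be U&'\00DC'; U&'\0220' denotes a different character, which would explain why it replaced nothing.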

As you already seem to know, single-character replacements are more efficient with a single translate() call.
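For the single-character case, a minimal sketch of translate() could look like this. The sample string and the character lists are made up for illustration; translate() itself is a built-in PostgreSQL function.

```sql
-- translate() maps each character of the second argument to the character
-- at the same position in the third argument, in a single pass.
SELECT translate('Café Zoë à la crème', 'éëàè', 'eeae');
-- returns 'Cafe Zoe a la creme'
```

Because the mapping is strictly one character to one character, translate() cannot turn ä into ae, which is why the multi-character cases still need replace().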

