除在bash到永久链接皈依 [英] Subject to permlink convertion in bash

查看:147
本文介绍了除在bash到永久链接皈依的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图转换用户输入,这是为博客条目成永久链接,其被用作URL和文件系统路径的受试者

I'm trying to convert user input, which is a subject for a blog entry into a permlink, which is used as URL and file system path.

我设法使用做到这一点:

I managed to do it using:


echo 'This is a used input, containing junk!!!! öäü' | tr -dc '[:alnum:] ' | tr '[:upper:]' '[:lower:]' | tr -s ' ' '-' | sed -e 's/ö/oe/' | sed -e 's/ü/ue/' | sed -e 's/ä/ae/' | sed -e 's/ß/ss/'

输出:这 - 是 - 含有使用的输入 - - 垃圾非统组织

Output: this-is-a-used-input-containing-junk-oau

在code绝对有效!

但有一个更好的办法做到这一点,而不需要经过这么多的子命令传递字符串?

But is there a nicer way to do this without the need to pass the string through this much subcommands?

任务在执行字符串


  • 转换为小写

  • 替换为空 -

  • 无多 -

  • 特殊字符,变音符号(包括德国的变音是足够的,涵盖所有将是一个加号)

推荐答案

它看起来像你音译字。 的iconv 能处理这个问题:

It looks like you're transliterating characters. iconv can handle this:

$ echo 'ö ä ü ß' | iconv -f utf-8 -t ascii//TRANSLIT
oe ae ue ss

这需要您的区域设置为 de_DE.UTF-8 (或类似的东西)来获得期望的结果(从你的问题和放大器;型材,我已经发你处理德语文本的假设)。

This requires your locale to be set to de_DE.UTF-8 (or something similar) to get the results you expect (from your question & profile, I've made the assumption you're dealing with German text).

要设置这只是iconv命令,使用这样的:

To set this for just the iconv command, use something like:

$ echo 'ö ä ü ß' | LC_ALL=de_DE.UTF-8 iconv -f utf-8 -t ascii//TRANSLIT

这也是你不使用UTF-8,但ISO-8859-1或ISO-8859-15可能;考虑如果可能的话切换到UTF-8,或者相应调整 -f 参数。

不幸的是,GNU TR (即Linux系统)卡在7位ASCII天(!),并且不支持转换的其他任何内容的情况下不是A到Z(它使用了用0x20的绝招XOR)。

Unfortunately, GNU tr (ie. Linux systems) is stuck in the 7-bit ASCII days(!), and doesn't support converting the case of anything other than a to z (it uses the "xor with 0x20 trick").

由于您的字符串反正转换为7位ASCII,我们可以使用 TR 之后的iconv 为它按预期工作:

Since you are converting your string to 7-bit ascii anyway, we can use tr after iconv for it to work as expected:

echo 'ö ä ü ß' | iconv -f utf-8 -t ascii//TRANSLIT | \
    tr '[:upper:]' '[:lower:]' 

我看不出有问题,其他2 TR 调用;他们都做不同的事情。大写转换为小写,除去重复字符,并删除空白。结果
它在一个聪明的命令相结合可能好看了,但也许不那么谁曾来维持它3年的时间的家伙或加仑好: - )

I don't see a problem with the other 2 tr invocations; they all do something different. Convert uppercase to lowercase, remove repeating characters, and remove whitespace.
Combining it in one "smart" command might look good now, but maybe not so good for the guy or gal who has to maintain it in 3 years time :-)

全部放在一起,并加入了一些换行,我们结束了:

Putting it all together, and adding some line breaks, we end up with:

$ echo 'ö ä ü ß' | \
    iconv -f utf-8 -t ascii//TRANSLIT | \
    tr '[:upper:]' '[:lower:]' | \
    tr -dc '[:alnum:] ' | \
    tr -s ' ' '-'

这篇关于除在bash到永久链接皈依的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆