How to handle diacritics (accents) when rewriting 'pretty URLs'


Question

I rewrite URLs to include the title of user-generated travel blogs.

I do this both for the readability of the URLs and for SEO purposes.


 http://www.example.com/gallery/280-Gorges_du_Todra/

The first integer is the id; the rest is for us humans (and is irrelevant for requesting the resource).
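
Because only the leading integer identifies the resource, the lookup can ignore the slug entirely. A minimal sketch, assuming the path layout shown above; the function name `gallery_id_from_path` is hypothetical and not part of the original post:

```php
<?php
// Hypothetical helper: extract the numeric id from a path such as
// /gallery/280-Gorges_du_Todra/ and ignore the human-readable part.
function gallery_id_from_path(string $path): ?int
{
    // "/gallery/", then one or more digits. The digits must be followed by
    // "-", "/" or the end of the path, so "/gallery/280abc" is rejected.
    if (preg_match('#^/gallery/(\d+)(?:[-/]|$)#', $path, $m) === 1) {
        return (int) $m[1];
    }
    return null; // not a gallery URL
}

var_dump(gallery_id_from_path('/gallery/280-Gorges_du_Todra/')); // int(280)
```

With this approach the slug can change (for example when the transliteration rules are revised) without breaking old links, since any text after the id resolves to the same resource.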

Now people can write titles containing any UTF-8 character, but most of those characters are not allowed in a URL. My audience is generally English-speaking, but since they travel, they like to include names like:


 Aït Ben Haddou

What is the proper way to translate this for display in a URL, using PHP on Linux?

So far I've seen several solutions:

  1. Just strip all disallowed characters and replace spaces. This has strange results:
    'Aït Ben Haddou' → /gallery/280-At_Ben_Haddou/
    Not really helpful.

  2. Just strip all disallowed characters, replace spaces, and leave the character code (as stackoverflow.com does), most likely because of the 'regex-hammer' used. This also gives strange results:
    'tést tést' → /questions/0000/t233st-t233st

  3. Translate to the 'nearest equivalent':
    'Aït Ben Haddou' → /gallery/280-Ait_Ben_Haddou/
    But this goes wrong for German; for example, 'ü' should be transliterated as 'ue'.
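
To make the first option concrete, here is a rough sketch of what "strip and replace" amounts to. The function name `naive_slug` and the exact set of allowed characters are assumptions for illustration; the source file is assumed to be saved as UTF-8:

```php
<?php
// Option 1 as described above: replace spaces, then drop every byte that is
// not an ASCII letter, digit, underscore or hyphen.
function naive_slug(string $title): string
{
    $slug = str_replace(' ', '_', $title);
    // No /u modifier: the pattern works byte-wise, so both bytes of a
    // UTF-8 encoded 'ï' are removed and nothing is left in their place.
    return preg_replace('/[^A-Za-z0-9_-]/', '', $slug);
}

echo '/gallery/280-' . naive_slug('Aït Ben Haddou') . '/', "\n";
// /gallery/280-At_Ben_Haddou/
```

The accented letter disappears entirely, which is why the result reads as a different word rather than a simplified spelling.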

For me, as a Dutch person, the 3rd result 'looks' the best.
I'm quite sure, however, that (1) many people will have a different opinion and (2) it is just plain wrong in the German example.

Another problem with the 3rd option is: how do I find all the characters that can be converted to a 7-bit equivalent?

So the questions are:

  1. What, in your opinion, is the most desirable result (within technical limits)?

  2. How do I solve it technically (reach the desired result) with PHP?

Recommended answer

Ultimately, you're going to have to give up on the idea of "correct" for this problem. Translating the string, no matter how you do it, destroys accuracy in the name of compatibility and readability. All three options are equally compatible, but #1 and #2 suffer in terms of readability. So just run with it and go for whatever looks best: option #3.

Yes, the translations are wrong for German, but unless you start requiring your users to specify what language their titles are in (and restricting them to only one), you're not going to solve that problem without far more effort than it's worth. (For example, running each word in the title through dictionaries for each known language and translating that word's diacritics according to the rules of its language would work, but it's excessive.)

Alternatively, if German is a higher concern than other languages, make your translation always use the German version when one exists: ä→ae, ë→e, ï→i, ö→oe, ü→ue.
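
The German-first mapping can be written as a single lookup table. This is only a sketch: the names `GERMAN_FIRST_MAP` and `german_first` are made up for illustration, only the lowercase letters listed above are covered, and the source file is assumed to be UTF-8:

```php
<?php
// German-first replacements as listed above: where a German convention
// exists (ä, ö, ü) it wins; ë and ï simply lose their diaeresis.
const GERMAN_FIRST_MAP = [
    'ä' => 'ae',
    'ë' => 'e',
    'ï' => 'i',
    'ö' => 'oe',
    'ü' => 'ue',
];

function german_first(string $text): string
{
    // strtr() with an array tries the longest keys first and never
    // re-scans text it has already replaced.
    return strtr($text, GERMAN_FIRST_MAP);
}

echo german_first('Aït Ben Haddou'), "\n"; // Ait Ben Haddou
echo german_first('Zürich'), "\n";         // Zuerich
```

The trade-off is that non-German names containing ä, ö or ü also get the German spelling, which is exactly the loss of "correctness" discussed above.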

Oh, and as for the actual method, I'd translate the special cases, if any, via str_replace, then use iconv for the rest:

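// Special cases first: German conventions for umlauts and ß.
$text = str_replace(array("ä", "ö", "ü", "ß"), array("ae", "oe", "ue", "ss"), $text);
// Transliterate whatever remains to ASCII; the result depends on the iconv implementation and locale.
$text = iconv('UTF-8', 'US-ASCII//TRANSLIT', $text);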
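
Put together with a final cleanup step, the two calls above might look like the sketch below. The function name `pretty_slug`, the fallback when iconv fails, and the cleanup rules (whitespace to "_", everything else unsafe dropped) are assumptions, not part of the answer. What `//TRANSLIT` produces for a given character varies with the iconv implementation and the current locale; on some systems a character with no known equivalent comes back as "?", which the last step then removes.

```php
<?php
// Hypothetical helper combining the answer's two steps with a cleanup pass.
function pretty_slug(string $title): string
{
    // 1. Special cases first (German conventions, as in the answer).
    $text = str_replace(['ä', 'ö', 'ü', 'ß'], ['ae', 'oe', 'ue', 'ss'], $title);

    // 2. Let iconv approximate the rest. iconv() returns false on failure.
    $ascii = iconv('UTF-8', 'US-ASCII//TRANSLIT', $text);
    if ($ascii === false) {
        $ascii = $text; // fall back; step 3 strips whatever is not URL-safe
    }

    // 3. Assumed cleanup: whitespace becomes "_", other unsafe bytes are dropped.
    $ascii = preg_replace('/\s+/', '_', trim($ascii));
    return preg_replace('/[^A-Za-z0-9_-]/', '', $ascii);
}

echo '/gallery/280-' . pretty_slug('Grüße aus Köln') . '/', "\n";
// /gallery/280-Gruesse_aus_Koeln/
```

The example title is fully handled by the str_replace step, so its output does not depend on iconv. For titles such as 'Aït Ben Haddou' the result should be checked on the target server before relying on it.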
