Remove umlauts or special characters in a JavaScript string


Question

I have never worked with umlauts or special characters in JavaScript strings before. My problem is: how do I remove them?

For example, I have this in JavaScript:

var oldstr = "Bayern München";
var str = oldstr.split(' ').join('-');

The result is Bayern-München, which was easy. Now I also want to remove the umlaut or special character, as in:

Real Sporting de Gijón.

How can I do this?

Kind regards,

Frank

Recommended answer

replace should be able to do it for you, e.g.:

var str = str.replace(/ü/g, 'u');

...though of course ü and u are not the same letter. :-)

If you're trying to replace all characters outside a given range with something (like a -), you can do that by specifying a range:

var str = str.replace(/[^A-Za-z0-9\-_]/g, '-');

That replaces every character that isn't an English letter, a digit, -, or _ with -. (The [...] part is a character class, and the ^ at the beginning means "not".)
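As a quick check of what that pattern does to the two strings from the question, here is a minimal sketch. The helper name replaceOutsideRange is made up for illustration, and the non-ASCII letters are written as unicode escapes so the result does not depend on the file's encoding.

```javascript
// Hypothetical helper (name is illustrative, not from the answer):
// replaces every character outside A-Z, a-z, 0-9, - and _ with a dash.
function replaceOutsideRange(input) {
  return input.replace(/[^A-Za-z0-9\-_]/g, '-');
}

// \u00fc is ü and \u00f3 is ó.
var munich = replaceOutsideRange("Bayern M\u00fcnchen");
var gijon = replaceOutsideRange("Real Sporting de Gij\u00f3n");

console.log(munich); // Bayern-M-nchen
console.log(gijon);  // Real-Sporting-de-Gij-n
```

Note that the space and the accented letter are treated the same way: both fall outside the class, so both become a dash.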

But that ("Bayern-M-nchen") may be a bit unpleasant for Mr. München to look at. :-) You could pass a function into replace to try to just drop the diacritics:

var str = str.replace(/[^A-Za-z0-9\-_]/g, function(ch) {
  // Characters that look a bit like 'a'
  if ("áàâä".indexOf(ch) >= 0) { // There are a lot more than these
    return 'a';
  }
  // Characters that look a bit like 'u'
  if ("úùûü".indexOf(ch) >= 0) { // There are a lot more than these
    return 'u';
  }
  /* ...long list of others...*/
  // Default
  return '-';
});
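The chain of if checks can also be written as a lookup table, which may be easier to extend as the list of letters grows. The following is only a sketch: the FOLD table and the foldChars name are invented for illustration, and the table is deliberately incomplete.

```javascript
// Illustrative, incomplete mapping from accented letters to a plain
// look-alike. Unicode escapes are used to avoid file-encoding issues.
var FOLD = {
  '\u00e1': 'a', '\u00e0': 'a', '\u00e2': 'a', '\u00e4': 'a', // á à â ä
  '\u00fa': 'u', '\u00f9': 'u', '\u00fb': 'u', '\u00fc': 'u', // ú ù û ü
  '\u00f3': 'o', '\u00f2': 'o', '\u00f4': 'o', '\u00f6': 'o'  // ó ò ô ö
};

// Hypothetical helper: one pass over the string, looking up each
// character outside the safe set and falling back to a dash.
function foldChars(input) {
  return input.replace(/[^A-Za-z0-9\-_]/g, function (ch) {
    return Object.prototype.hasOwnProperty.call(FOLD, ch) ? FOLD[ch] : '-';
  });
}

console.log(foldChars("Bayern M\u00fcnchen"));         // Bayern-Munchen
console.log(foldChars("Real Sporting de Gij\u00f3n")); // Real-Sporting-de-Gijon
```

Anything not in the table, including the space, still becomes a dash, so the behaviour matches the function-based version for the letters both cover.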


The above is optimized for long strings. If the string itself is short, you may be better off with repeated regexps:

var str = str.replace(/[áàâä]/g, 'a')
             .replace(/[úùûü]/g, 'u')
             .replace(/[^A-Za-z0-9\-_]/g, '-');

...but that's speculative.

Note that literal characters in JavaScript strings are totally fine, but you can run into trouble with the encoding of files. I tend to stick to unicode escapes. So for instance, the above would be:

var str = str.replace(/[\u00e4\u00e2\u00e0\u00e1]/g, 'a')
             .replace(/[\u00fc\u00fb\u00f9\u00fa]/g, 'u')
             .replace(/[^A-Za-z0-9\-_]/g, '-');

...but again, there are a lot more to do...
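One way to avoid listing every letter by hand, which is not part of the answer above, is the built-in String.prototype.normalize available in ES2015 and later engines. Normalizing to NFD splits many accented letters into a base letter plus combining marks, and the marks can then be removed. The sketch below assumes such an engine; the function names stripDiacritics and slugify are made up for illustration.

```javascript
// Hypothetical helper: decompose to NFD, then drop the combining marks
// in the range U+0300 to U+036F (Combining Diacritical Marks).
function stripDiacritics(input) {
  return input.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical helper: strip diacritics first, then replace whatever is
// still outside the safe set (spaces included) with a dash.
function slugify(input) {
  return stripDiacritics(input).replace(/[^A-Za-z0-9\-_]/g, '-');
}

console.log(slugify("Bayern M\u00fcnchen"));         // Bayern-Munchen
console.log(slugify("Real Sporting de Gij\u00f3n")); // Real-Sporting-de-Gijon
```

This only helps for letters that have a Unicode decomposition. Characters such as ß or ø do not decompose under NFD, so they would still end up as a dash unless they are mapped explicitly.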
