How can I delete special characters?
Question
I'm practicing with Ruby and regular expressions to delete certain unwanted characters. For example, to strip HTML tags:
input = input.gsub(/<\/?[^>]*>/, '')
And for special characters, for example ☻ or ™:
input = input.gsub('&#', '')
This leaves only the numbers, which is fine. But it only works if the user enters the special character as a code, like this:
™
My question: how can I delete special characters if the user enters them directly, without a code, like this:
™ ☻
Recommended answer
First of all, I think it might be easier to define what constitutes "correct input" and remove everything else. For example:
input = input.gsub(/[^0-9A-Za-z]/, '')
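To make the effect of that allow-list concrete, here is a minimal sketch that wraps the same regex in a helper. The method name `keep_ascii_alphanumerics` is made up for illustration. Note that it removes spaces, punctuation, and non-Latin letters along with the symbols.

```ruby
# Hypothetical helper around the allow-list regex from the answer:
# keep only ASCII digits and letters, remove everything else.
def keep_ascii_alphanumerics(input)
  input.gsub(/[^0-9A-Za-z]/, '')
end

puts keep_ascii_alphanumerics('Ruby™ 1.9 ☻') # prints "Ruby19"
puts keep_ascii_alphanumerics('中文 abc')     # prints "abc"
```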
If that's not what you want (for example, you want to support non-Latin alphabets), then I think you should make a list of the glyphs you want to remove (like ™ or ☻) and remove them one by one, since it's hard to distinguish programmatically between a Chinese or Arabic character and a pictograph.
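The list-based approach might look like the sketch below. The constant and method names are invented for illustration, and the list holds only the two glyphs mentioned in the question; you would extend it with whatever you want to drop.

```ruby
# Hypothetical block-list: only the two glyphs from the question.
UNWANTED_GLYPHS = ['™', '☻'].freeze

# Regexp.union builds a single pattern that matches any listed glyph.
UNWANTED_PATTERN = Regexp.union(UNWANTED_GLYPHS)

# Remove every listed glyph and leave all other characters untouched.
def strip_unwanted_glyphs(input)
  input.gsub(UNWANTED_PATTERN, '')
end

puts strip_unwanted_glyphs('Ruby™ 中文☻') # prints "Ruby 中文"
```

Unlike the ASCII allow-list, this keeps Chinese, Arabic, and other non-Latin text intact, but you have to maintain the list by hand.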
Finally, you might want to normalize your input first by converting to or from HTML escape sequences, so that each special character appears in only one form before you filter it.
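As a rough sketch of normalizing "from" escape sequences, the code below decodes decimal numeric references such as `&#8482;` into the characters they stand for, then applies one filter to the result. It assumes UTF-8 strings and handles only the decimal `&#NNN;` form. The method names are made up for illustration, and it does not validate code points.

```ruby
# Hypothetical helper: turn decimal references like "&#8482;" into the
# character they encode. pack('U') produces the UTF-8 character.
def decode_decimal_references(input)
  input.gsub(/&#(\d+);/) { [Regexp.last_match(1).to_i].pack('U') }
end

# Decode first, so one filter catches both the escaped and the literal form.
def strip_unwanted(input)
  decode_decimal_references(input).gsub(/[™☻]/, '')
end

puts strip_unwanted('Ruby&#8482; and Ruby™') # prints "Ruby and Ruby"
```

Decoding before filtering means the `gsub('&#', '')` workaround from the question is no longer needed.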