How to remove control characters from a string?

Question

I have a form on my page where a user can type some text and submit it. The text is then sent to the server (a REST API on top of node.js) and saved to the database (postgres).

The problem is that some strange characters (control characters) are occasionally saved to the database, for example the escape control character (^[) or the backspace control character (^H). Generally this does not break anything, since those characters are invisible and the HTML renders correctly. However, when I provide XML content for RSS readers, the readers return "Malformed XML" because of those control characters (it works after deleting them).

My question is: how can I remove those characters from a string, either on the client (JavaScript) or on the server (JavaScript/node.js)?

Recommended answer

Control characters in Unicode are at code points U+0000 through U+001F and U+007F through U+009F. Use a RegExp to find those control characters and replace them with an empty string:

str.replace(/[\u0000-\u001F\u007F-\u009F]/g, "")
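To make the one-liner above easier to reuse on either the client or the server, it can be wrapped in a small function. This is a minimal sketch: the function name `stripControlChars` and the sample input are made up for illustration, while the regular expression is the one from the answer.

```javascript
// Hypothetical helper name; the regular expression is the one from the answer.
function stripControlChars(str) {
  // Matches U+0000-U+001F, then U+007F (DEL) through U+009F.
  return str.replace(/[\u0000-\u001F\u007F-\u009F]/g, "");
}

// Sample input containing an escape (U+001B) and a backspace (U+0008) character.
const dirty = "Hello\u001B, wor\u0008ld";
console.log(stripControlChars(dirty)); // prints "Hello, world"
```

Because strings are immutable in JavaScript, `replace` returns a new string, so the result has to be assigned or returned rather than expecting `str` to change in place.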

If you want to remove additional characters, add them to the character class inside the RegExp. For example, to remove U+200B ZERO WIDTH SPACE as well, add \u200B before the ].
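As a sketch of extending the character class, the first function below adds \u200B to it. The second function is an assumption that goes beyond the answer: the range U+0000 through U+001F also covers tab, line feed, and carriage return, which XML 1.0 permits, so a variant that skips them may suit multi-line user text better. Both function names are hypothetical.

```javascript
// Hypothetical helper: the answer's character class with U+200B ZERO WIDTH SPACE added.
function stripControlAndZeroWidth(str) {
  return str.replace(/[\u0000-\u001F\u007F-\u009F\u200B]/g, "");
}

// Hypothetical variant (an assumption, not part of the answer): keep tab (U+0009),
// line feed (U+000A), and carriage return (U+000D), which XML 1.0 allows.
function stripControlKeepWhitespace(str) {
  return str.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F-\u009F]/g, "");
}

console.log(stripControlAndZeroWidth("zero\u200Bwidth")); // prints "zerowidth"
// JSON.stringify makes the surviving newline visible as \n in the output.
console.log(JSON.stringify(stripControlKeepWhitespace("line1\nline2\u0008"))); // prints "line1\nline2" (with quotes)
```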
