How to remove control chars from UTF8 string

Views: 337  Posted: 2015/11/26 1:14:03  Tags: .net regex vb.net utf-8

Question

I have a VB.NET program that handles the content of documents. The program processes high volumes of documents in batches (more than 2 million documents; 1 TB total volume). Some of these documents may contain control chars or chars like f0e8 (http://www.fileformat.info/info/unicode/char/f0e8/browsertest.htm).

Is there an easy and, above all, fast way to remove those chars (except space, newline, tab, ...)? If the answer is regex: does anyone have a complete regex for me?

Thanks!

Recommended answer

Try:

' Requires: Imports System.Text.RegularExpressions
resultString = Regex.Replace(subjectString, "\p{C}+", "")

This will remove all "other" Unicode characters (control, format, private use, surrogate, and unassigned) from your string.
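
One caveat with the pattern above: tab, carriage return and line feed are themselves control characters (category Cc), so `\p{C}+` strips them too, while the question asked to keep them. A minimal sketch of one way around that, assuming .NET's character class subtraction syntax `[base-[excluded]]`, is shown below. The names `ControlCharCleaner`, `OtherExceptWhitespace` and `RemoveControlChars` are hypothetical and not part of the original answer, and `RegexOptions.Compiled` is only a guess at what might help a large batch; it should be measured rather than assumed.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ControlCharCleaner
    ' Matches runs of Unicode "other" characters (\p{C}) except tab, CR and LF.
    ' [\p{C}-[\t\r\n]] uses .NET character class subtraction.
    ' The Regex is built once and reused for every document in the batch.
    Private ReadOnly OtherExceptWhitespace As New Regex("[\p{C}-[\t\r\n]]+", RegexOptions.Compiled)

    ' Hypothetical helper: returns the input with the matched characters removed.
    Public Function RemoveControlChars(input As String) As String
        Return OtherExceptWhitespace.Replace(input, "")
    End Function

    Sub Main()
        ' Sample with a NUL (control) and U+F0E8 (private use), plus a tab and CRLF to keep.
        Dim sample As String = "a" & ChrW(0) & "b" & vbTab & "c" & ChrW(&HF0E8) & vbCrLf
        Dim cleaned As String = RemoveControlChars(sample)
        ' Checks that only the NUL and the private-use character were dropped.
        Console.WriteLine(cleaned = "ab" & vbTab & "c" & vbCrLf)
    End Sub
End Module
```

Plain spaces are not affected by either pattern, because they belong to the separator category (Zs) rather than `\p{C}`. It is also worth checking how your data behaves with characters outside the Basic Multilingual Plane: .NET strings are UTF-16, and `\p{C}` includes the surrogate category, so such characters may be removed as well.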
