How do I check for non-ASCII characters inside a text file using C#?


Question

I have a text file that arrives every day, and it contains non-ASCII characters. At the moment I check the file in TextPad, find the non-ASCII characters, and remove them by hand, replacing them with plain letters. Is there a way to do this check automatically in C#?

Thanks in advance

Recommended answer

It's not too difficult: you just have to decide which characters you consider valid and which you don't. Bear in mind that ASCII covers a reasonably wide range, from NUL at 0x00 to DEL at 0x7F, and that not all of those characters are readable / printable: http://ascii-table.com/
The simplest form is to do this:
using System.IO;
using System.Linq;
byte[] bytes = File.ReadAllBytes(@"D:\Temp\inputfile.bin");
// Keep only bytes in the 7-bit ASCII range (0x00 to 0x7F).
byte[] outp = bytes.Where(c => c <= 127).ToArray();
File.WriteAllBytes(@"D:\Temp\outputfile.txt", outp);
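
If you only want to check the file before changing anything, a rough sketch like the following could report where the non-ASCII bytes are. The class and method names (AsciiCheck, FindNonAsciiOffsets, Report) are made up for illustration, and it assumes the file is small enough to read into memory in one go.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class AsciiCheck
{
    // Returns the zero-based offset of every byte above 0x7F.
    public static List<int> FindNonAsciiOffsets(byte[] data)
    {
        var offsets = new List<int>();
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] > 127)
            {
                offsets.Add(i);
            }
        }
        return offsets;
    }

    // Hypothetical helper: prints a short report for one file.
    public static void Report(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        List<int> offsets = FindNonAsciiOffsets(bytes);
        Console.WriteLine($"{path}: {offsets.Count} non-ASCII byte(s)");
        foreach (int offset in offsets)
        {
            Console.WriteLine($"  offset {offset}: 0x{bytes[offset]:X2}");
        }
    }
}
```

Calling AsciiCheck.Report(@"D:\Temp\inputfile.bin") would list each offending byte. These are byte offsets, so in a UTF-8 file a single non-ASCII character shows up as several consecutive entries.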

Filtering on c <= 127 throws away every byte outside the full ASCII range, but you probably want to restrict it somewhat more:

byte[] bytes = File.ReadAllBytes(@"D:\Temp\inputfile.bin");
// Keep only printable ASCII: space (32) through tilde (126).
byte[] outp = bytes.Where(c => c >= 32 && c < 127).ToArray();
File.WriteAllBytes(@"D:\Temp\outputfile.txt", outp);



Restricting the range to 32 through 126 discards all ASCII control codes as well and leaves only printable characters. But that probably isn't exactly what you need, since (I'm guessing) you need to keep some of the control codes, line breaks for instance. Which ones, I don't know: it's your application!
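
As a minimal sketch of keeping a few control codes, the filter could look something like this. The choice of tab, line feed and carriage return is only an assumption about what a text file needs, and the names AsciiFilter, KeepPrintable and CleanFile are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class AsciiFilter
{
    // Assumption: tab (0x09), line feed (0x0A) and carriage return (0x0D)
    // are the control codes worth keeping. Adjust the set for your data.
    private static readonly HashSet<byte> KeptControlCodes =
        new HashSet<byte> { 0x09, 0x0A, 0x0D };

    // Keeps printable ASCII (32 to 126) plus the control codes listed above.
    public static byte[] KeepPrintable(byte[] data)
    {
        return data
            .Where(b => (b >= 32 && b < 127) || KeptControlCodes.Contains(b))
            .ToArray();
    }

    // Reads the input file, filters it, and writes the result to a new file.
    public static void CleanFile(string inputPath, string outputPath)
    {
        byte[] cleaned = KeepPrintable(File.ReadAllBytes(inputPath));
        File.WriteAllBytes(outputPath, cleaned);
    }
}
```

AsciiFilter.CleanFile(@"D:\Temp\inputfile.bin", @"D:\Temp\outputfile.txt") would then preserve the line structure of the file. The filter works byte by byte, so it suits single-byte encodings and UTF-8 but not UTF-16 input.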

