Original file bytes from StreamReader, magic number detection
Question
I'm trying to differentiate between "text" files and "binary" files, because I would like to ignore files with "unreadable" contents.
I have a file that I believe is a GZIP archive. I'm trying to ignore this kind of file by detecting the magic number / file signature. If I open the file with the Hex editor plugin in Notepad++, I can see that the first three bytes are 1f 8b 08.
However, if I read the file using a StreamReader, I'm not sure how to get at the original bytes:
using (var streamReader = new StreamReader(@"C:\file"))
{
    char[] buffer = new char[10];
    streamReader.Read(buffer, 0, 10);
    var s = new String(buffer);
    byte[] bytes = new byte[6];
    System.Buffer.BlockCopy(s.ToCharArray(), 0, bytes, 0, 6);
    var hex = BitConverter.ToString(bytes);
    var otherhex = BitConverter.ToString(System.Text.Encoding.UTF8.GetBytes(s.ToCharArray()));
}
At the end of the using statement I have the following variable values:
hex: "1F-00-FD-FF-08-00"
otherhex: "1F-EF-BF-BD-08-00-EF-BF-BD-EF-BF-BD-0A-51-02-03"
Neither of these starts with the hex values shown in Notepad++.
Is it possible to get the original bytes from the result of reading the file via StreamReader?
Recommended answer
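Your code decodes the binary data as text and then tries to turn the resulting string back into bytes. StreamReader interprets the file as UTF-8 by default, so any byte that is not valid UTF-8 (such as 0x8B here) is replaced with the replacement character U+FFFD, and the original value is lost. Strings in .NET are UTF-16, so each char takes two bytes, which is why BlockCopy gives you 1F-00-FD-FF-08-00: the characters U+001F, U+FFFD and U+0008. The sequence EF-BF-BD in otherhex is simply U+FFFD encoded as UTF-8.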
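The lossy decoding can be reproduced without any file at all. The following is a minimal sketch, not part of the original answer; the class and method names (DecodeDemo, RoundTripAsUtf8, Utf16Bytes) are hypothetical and chosen only for illustration. It decodes the three signature bytes as UTF-8, the way StreamReader does by default, and then dumps the resulting string as UTF-8 and as UTF-16.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class DecodeDemo
{
    // Decodes raw bytes as UTF-8 text, re-encodes the text as UTF-8,
    // and returns the re-encoded bytes as a hex string.
    public static string RoundTripAsUtf8(byte[] raw)
    {
        string text = Encoding.UTF8.GetString(raw);
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    // Decodes raw bytes as UTF-8 text and returns the in-memory
    // UTF-16 (little-endian) representation as a hex string.
    public static string Utf16Bytes(byte[] raw)
    {
        string text = Encoding.UTF8.GetString(raw);
        return BitConverter.ToString(Encoding.Unicode.GetBytes(text));
    }
}
```

Called with new byte[] { 0x1F, 0x8B, 0x08 }, RoundTripAsUtf8 returns "1F-EF-BF-BD-08" and Utf16Bytes returns "1F-00-FD-FF-08-00", which match the beginnings of otherhex and hex in the question. In both cases the 0x8B byte has already been replaced, so it cannot be recovered from the string.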
Just use a BinaryReader and its ReadBytes method, which returns the bytes exactly as they are stored in the file:
// Requires: using System.IO;
using (var fs = new FileStream(@"C:\file", FileMode.Open, FileAccess.Read))
using (var reader = new BinaryReader(fs))
{
    // ReadBytes returns fewer bytes than requested if the file is shorter.
    byte[] buffer = reader.ReadBytes(10);
    // 0x1F 0x8B 0x08 is the same as 31, 139, 8 in decimal.
    if (buffer.Length >= 3 && buffer[0] == 0x1F && buffer[1] == 0x8B && buffer[2] == 0x08)
    {
        // you have a signature match....
    }
}
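If the check is needed in several places, it could be wrapped in a small helper. This is a sketch under assumptions rather than part of the original answer: GzipSniffer and LooksLikeGzip are hypothetical names, and the helper reads directly from a Stream instead of going through BinaryReader. It treats a file shorter than the signature as "not gzip" instead of throwing.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class GzipSniffer
{
    // The three bytes seen in the hex editor: the two gzip ID bytes
    // followed by 0x08, the DEFLATE compression method.
    private static readonly byte[] GzipMagic = { 0x1F, 0x8B, 0x08 };

    // Returns true if the stream starts with the gzip signature.
    public static bool LooksLikeGzip(Stream stream)
    {
        var header = new byte[GzipMagic.Length];
        int total = 0;
        // Stream.Read may return fewer bytes than requested, so loop.
        while (total < header.Length)
        {
            int n = stream.Read(header, total, header.Length - total);
            if (n == 0) return false; // stream ended before a full signature
            total += n;
        }
        for (int i = 0; i < GzipMagic.Length; i++)
        {
            if (header[i] != GzipMagic[i]) return false;
        }
        return true;
    }

    // Convenience overload that opens the file read-only.
    public static bool LooksLikeGzip(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            return LooksLikeGzip(fs);
        }
    }
}
```

A caller that wants to skip such files could then write something like: if (GzipSniffer.LooksLikeGzip(@"C:\file")) { /* ignore this file */ }. Note that a signature match only suggests a gzip archive; it does not prove the rest of the file is valid.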