Original file bytes from StreamReader, magic number detection


Question

I'm trying to differentiate between "text" files and "binary" files, because I want to ignore files with "unreadable" contents.

I have a file that I believe is a GZIP archive. I'm trying to ignore this kind of file by detecting its magic number / file signature. If I open the file with the Hex editor plugin in Notepad++, I can see that the first three bytes are 1f 8b 08.

However, if I read the file using a StreamReader, I'm not sure how to get at the original bytes.

using (var streamReader = new StreamReader(@"C:\file"))
{
    char[] buffer = new char[10];
    streamReader.Read(buffer, 0, 10);
    var s = new String(buffer);

    byte[] bytes = new byte[6];
    System.Buffer.BlockCopy(s.ToCharArray(), 0, bytes, 0, 6);
    var hex = BitConverter.ToString(bytes);

    var otherhex = BitConverter.ToString(System.Text.Encoding.UTF8.GetBytes(s.ToCharArray()));
}

At the end of the using statement I have the following variable values:

hex: "1F-00-FD-FF-08-00"
otherhex: "1F-EF-BF-BD-08-00-EF-BF-BD-EF-BF-BD-0A-51-02-03"

Neither of these starts with the hex values shown in Notepad++.

Is it possible to get the original bytes from the result of reading a file through StreamReader?

Recommended answer

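Your code decodes binary data as text. StreamReader interprets the file's bytes using a text encoding (UTF-8 by default), and any byte that is not valid in that encoding, such as the lone 0x8B here, is replaced with the Unicode replacement character U+FFFD, so the original byte is lost. Strings in .NET are UTF-16, so each char occupies two bytes, which is why your first dump reads 1F-00-FD-FF-08-00. Re-encoding the string as UTF-8 produces EF-BF-BD for every replaced byte, which is what your second dump shows.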
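The substitution described above can be reproduced without a file. The following is a minimal sketch, assuming the reader uses the default UTF-8 decoding; the class and method names (`DecodingDemo`, `DecodeAsUtf8`, `Utf16Hex`, `Utf8Hex`) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of the original question or answer.
public static class DecodingDemo
{
    // Decodes raw bytes as UTF-8. Invalid sequences are replaced with U+FFFD,
    // so the result cannot be mapped back to the original bytes.
    public static string DecodeAsUtf8(byte[] raw) => Encoding.UTF8.GetString(raw);

    // Hex dump of the string's UTF-16 little-endian code units,
    // comparable to the Buffer.BlockCopy dump in the question.
    public static string Utf16Hex(string s) =>
        BitConverter.ToString(Encoding.Unicode.GetBytes(s));

    // Hex dump of the string re-encoded as UTF-8,
    // comparable to the Encoding.UTF8.GetBytes dump in the question.
    public static string Utf8Hex(string s) =>
        BitConverter.ToString(Encoding.UTF8.GetBytes(s));
}
```

Passing the three header bytes 1F 8B 08 through `DecodeAsUtf8` and then through `Utf16Hex` and `Utf8Hex` gives dumps that begin the same way as the `hex` and `otherhex` values in the question: the 0x8B byte has already become U+FFFD by the time the string exists.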

Just use a BinaryReader and its ReadBytes method:

using (FileStream fs = new FileStream(@"C:\file", FileMode.Open, FileAccess.Read))
using (var reader = new BinaryReader(fs))
{
    // ReadBytes returns fewer than 10 bytes if the file is shorter than that.
    byte[] buffer = reader.ReadBytes(10);

    // 0x1F 0x8B is the GZIP magic number; 0x08 is the DEFLATE compression method.
    if (buffer.Length >= 3 && buffer[0] == 0x1F && buffer[1] == 0x8B && buffer[2] == 0x08)
    {
        // you have a signature match....
    }
}
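The same check can be wrapped in a small reusable helper. This is a sketch under stated assumptions, not a definitive implementation: `GzipSniffer` and `HasGzipSignature` are hypothetical names, and the StreamReader overload assumes the underlying stream is seekable. It relies on `StreamReader.BaseStream`, which exposes the raw stream behind the reader and so addresses the question of getting at the bytes when only a StreamReader is at hand.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of the original answer.
public static class GzipSniffer
{
    // The three bytes seen in the hex editor: magic 1F 8B, then 08 (DEFLATE).
    private static readonly byte[] Signature = { 0x1F, 0x8B, 0x08 };

    // Returns true if the stream, read from its current position,
    // begins with the signature bytes.
    public static bool HasGzipSignature(Stream stream)
    {
        var header = new byte[Signature.Length];
        int total = 0;
        // Stream.Read may return fewer bytes than requested, so loop
        // until the header is full or the stream ends.
        while (total < header.Length)
        {
            int n = stream.Read(header, total, header.Length - total);
            if (n == 0) return false; // stream is shorter than the signature
            total += n;
        }
        for (int i = 0; i < Signature.Length; i++)
        {
            if (header[i] != Signature[i]) return false;
        }
        return true;
    }

    // Convenience overload that opens the file at the given path.
    public static bool HasGzipSignature(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            return HasGzipSignature(fs);
        }
    }

    // Overload for an existing StreamReader. Assumes a seekable BaseStream.
    // The stream is rewound to the start before and after the check, and the
    // reader's internal buffer is discarded so later reads begin at byte 0.
    public static bool HasGzipSignature(StreamReader reader)
    {
        reader.BaseStream.Seek(0, SeekOrigin.Begin);
        bool match = HasGzipSignature(reader.BaseStream);
        reader.BaseStream.Seek(0, SeekOrigin.Begin);
        reader.DiscardBufferedData();
        return match;
    }
}
```

A call such as `GzipSniffer.HasGzipSignature(@"C:\file")` would then decide whether to skip the file. Checking the signature on the raw stream before any text decoding keeps the bytes intact; once the content has passed through the reader's decoder, the original bytes cannot be reliably recovered from the resulting string.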
