Programmatically find out a file type by looking at its binary content. Possible?

Question

I have a C# component that will receive a file of one of the following types: .doc, .pdf, .xls, .rtf

These will be sent by the calling Siebel legacy app as a file stream.

So...

[LegacyApp] >> {Binary file stream} >> [Component]

The legacy app is a black box that can't be modified to tell the component what file type (doc, pdf, xls, rtf) it is sending. The component needs to read this binary stream and create a file on the filesystem with the right extension.

Any ideas?

Thanks for your time.

Recommended answer

On Linux/Unix-based systems you can use the file command, but I assume you want to do this yourself in code...

If all you have access to is the byte stream of the file, then you would need to handle each file type independently.

Most programs/components that do what you are describing read the first few bytes and classify the file based on that. For example, GIF files start with one of the following: GIF87a or GIF89a.
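As a minimal sketch of that idea, the check below compares the first six bytes of a buffer against the two GIF signatures mentioned above. It is written in Python for brevity (the component in the question is C#, but the same byte comparison ports directly), and the function name is_gif is just an illustrative choice.

```python
# The two signatures a GIF file can start with, as raw ASCII bytes.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def is_gif(data: bytes) -> bool:
    """Return True if the buffer begins with one of the GIF signatures."""
    # Slicing is safe even when the buffer is shorter than six bytes.
    return data[:6] in GIF_SIGNATURES
```

Only the start of the stream is needed for this kind of check, so the component can decide before it has read the whole file.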

Many file formats have a fixed signature at the start of the file, or a fixed header format. This signature is referred to as a magic number, as described by me in this post.
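For the four types in the question, a rough sketch might look like the following. PDF files normally begin with %PDF- and RTF files with {\rtf. Legacy .doc and .xls files, however, share the same OLE2 compound-file signature (D0 CF 11 E0 A1 B1 1A E1), so the magic number alone cannot tell them apart. The sketch works around this by scanning the buffer for the UTF-16LE stream names "WordDocument" and "Workbook". That scan is a heuristic assumed here for illustration, not a full parse of the container, and it can be fooled by embedded objects or by very old Excel files. The function name guess_extension and the ".bin" fallback are hypothetical. It is written in Python for brevity; the logic carries over to C# unchanged.

```python
# Signature shared by all OLE2 compound files (legacy .doc, .xls, .ppt, ...).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Stream names inside an OLE2 container are stored as UTF-16LE text.
WORD_STREAM = "WordDocument".encode("utf-16-le")
EXCEL_STREAM = "Workbook".encode("utf-16-le")


def guess_extension(data: bytes) -> str:
    """Guess a file extension from the raw bytes of a file.

    Returns ".bin" (an arbitrary fallback) when nothing matches.
    """
    if data.startswith(b"%PDF-"):
        return ".pdf"
    if data.startswith(b"{\\rtf"):
        return ".rtf"
    if data.startswith(OLE2_MAGIC):
        # Heuristic only: look for the main stream name anywhere in the data.
        if WORD_STREAM in data:
            return ".doc"
        if EXCEL_STREAM in data:
            return ".xls"
    return ".bin"
```

Unlike a plain prefix check, the .doc/.xls branch needs the whole file in memory, because the directory that holds the stream names can sit anywhere in the container.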

A good place to get started is www.wotsit.org. It contains file format specifications, searchable by file type. You could look at the file types you need to handle and see if you can find some identifying feature in each of those formats.

You could also search Google to try and find a library that does this classification, or look at the source code of the file command.
