How does Linux recognize a file as a certain file type, and how to programmatically change it?


Question

I am creating a program in Java that reads the input stream of a file, encrypts it by changing the byte values based on the password, and creates a new encrypted file.

For example:
I created a test file that contained the words:
This is a test to see if the encrypter project works.
When I read the bytes in Java, I get:
[84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 116, 101, 115, 116, 32, 116, 111, 32, 115, 101, 101, 32, 105, 102, 32, 116, 104, 101, 32, 101, 110, 99, 114, 121, 112, 116, 101, 114, 32, 112, 114, 111, 106, 101, 99, 116, 32, 119, 111, 114, 107, 115, 46, 10]
So then I take the value of each byte, subtract the Unicode value of the password, and take the absolute value of the result. Then I write that to a file.

I was playing around with different algorithms to encrypt it, and started testing them on a test text file. I am using Linux, so there are no file extensions (e.g. .txt, .pdf, etc.). After encrypting it a few times, I noticed that the computer no longer recognized it as a text file but as an image file! (Meaning that when you click on it, it tries to open the file in an image editor by default.)

Here are my questions:

  • I'm guessing it has something to do with certain bytes in the file, but beyond that I'm lost.
  • I'd like the file to keep the same file type even after encryption. So I was thinking that if, for example, the file type information is in the first 10 bytes, I would encrypt everything after that but leave the first ten bytes alone.
  • Do these bytes have a meaning that is standard across all platforms? (i.e. a PDF file is a PDF file no matter what computer you use it on. Is that because of the .pdf extension, or because of bytes somewhere in the file?)
  • Where can I find a list of which bytes in a file mean what?

Recommended answer

On traditional UNIX systems, files are identified solely by looking for particular patterns of bytes appearing in the file.

The file command uses a magic configuration file (often /etc/magic, or /usr/share/file/magic) which contains the rules defining those byte patterns.
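
To make the idea of byte-pattern rules concrete, here is a minimal Java sketch that checks the leading bytes of a buffer against three well-known signatures (PDF, PNG, GIF). It is only an illustration: the class name `MagicSniffer` and its methods are made up for this example, and the real magic database contains far more rules, many of which look at offsets other than the start of the file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the file(1) tooling.
final class MagicSniffer {
    // Leading signatures of three common formats.
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF = "GIF8".getBytes(StandardCharsets.US_ASCII);

    // True if data begins with exactly the bytes in prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Returns a short type label, or "unknown" if no signature matches.
    static String guessType(byte[] data) {
        if (startsWith(data, PDF)) return "pdf";
        if (startsWith(data, PNG)) return "png";
        if (startsWith(data, GIF)) return "gif";
        return "unknown";
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(guessType(sample));
    }
}
```

Plain text has no signature of this kind, so a buffer holding the test sentence from the question falls through to "unknown" here.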

That's it - there's no special extra meta-data - it's all done by analysis of the content.
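
Since identification depends only on content, the idea from the question of leaving the first few bytes alone can be sketched as follows. This is an assumption-laden illustration, not the asker's algorithm: it swaps in a repeating-key XOR (which is reversible but not a secure cipher) in place of the subtract-and-absolute-value step, and the class name `HeaderPreservingXor` and the `headerLength` parameter are hypothetical. It only helps for formats whose signature sits entirely within the preserved prefix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; XOR is not real encryption.
final class HeaderPreservingXor {
    // Copies data, leaves the first headerLength bytes untouched, and XORs
    // the remainder with a repeating key. Applying it twice restores the input.
    static byte[] transform(byte[] data, byte[] key, int headerLength) {
        if (key.length == 0) {
            throw new IllegalArgumentException("key must not be empty");
        }
        byte[] out = Arrays.copyOf(data, data.length);
        int start = Math.min(Math.max(headerLength, 0), data.length);
        for (int i = start; i < out.length; i++) {
            out[i] ^= key[(i - start) % key.length];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = "%PDF-1.4 body bytes".getBytes(StandardCharsets.US_ASCII);
        byte[] key = "password".getBytes(StandardCharsets.US_ASCII);

        byte[] scrambled = transform(data, key, 5);
        byte[] restored = transform(scrambled, key, 5);

        // The 5-byte "%PDF-" prefix survives; the rest is scrambled.
        System.out.println(new String(scrambled, 0, 5, StandardCharsets.US_ASCII));
        System.out.println(Arrays.equals(restored, data));
    }
}
```

A tool that inspects more than the preserved prefix may still classify the result differently, so the prefix length would have to be chosen per format.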
