What causes the computer to recognize a file as a certain file type, and how can I change it (with Java)?


Problem description

I am creating a program in Java that reads the input stream of a file, encrypts it by changing the byte values based on the password, and creates a new encrypted file.

For example, I created a test file that contained the words:

This is a test to see if the encrypter project works.

When I read the bytes in Java, I get:

[84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 116, 101, 115, 116, 32, 116, 111, 32, 115, 101, 101, 32, 105, 102, 32, 116, 104, 101, 32, 101, 110, 99, 114, 121, 112, 116, 101, 114, 32, 112, 114, 111, 106, 101, 99, 116, 32, 119, 111, 114, 107, 115, 46, 10]

I then take the value of each byte, subtract the Unicode value of the password, and take the absolute value of the result. Then I write that to a file.
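The step described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name AbsDiffTransform, the method transform, and the choice to cycle through the password characters one per byte are all hypothetical, since the question does not say how the password is applied.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AbsDiffTransform {

    // Hypothetical helper: replaces each byte with |byte - passwordChar|.
    // Cycling through the password is an assumption, not the asker's stated method.
    // The password must be non-empty.
    static byte[] transform(byte[] input, String password) {
        byte[] out = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            int value = input[i] & 0xFF;                       // read the byte as 0..255
            int key = password.charAt(i % password.length());  // Unicode value of the password char
            out[i] = (byte) Math.abs(value - key);             // cast keeps the low 8 bits
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plain = "This".getBytes(StandardCharsets.US_ASCII); // 84, 104, 105, 115
        byte[] result = transform(plain, "ab");                    // 'a' = 97, 'b' = 98
        System.out.println(Arrays.toString(result));               // prints [13, 6, 8, 17]
    }
}
```

In this sketch the output bytes land in the control-character range, which may be one reason content-based detection stops treating the result as text. Taking an absolute value also discards the sign, so two different input bytes (84 and 110 with key 97, for instance) map to the same output.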

I was playing around with different algorithms to encrypt it, and started testing them on a test text file. I am using Linux, so there are no file extensions (e.g. .txt, .pdf, etc.). I noticed that after encrypting it a few times, the computer no longer recognized it as a text file, but instead as an image file! (Meaning that when you click on it, it tries by default to open the file in an image editor.)

So here are my questions:

  • I am guessing it has something to do with certain bytes somewhere in the file, but beyond that, I am lost.

  • I want to be able to keep the file as the same file type even after encryption. So I was thinking that if, for example, the file type information is in the first 10 bytes, I would encrypt everything after that but leave the first 10 bytes untouched.

  • Do these bytes have a meaning that is standard across all platforms (i.e. a PDF file is a PDF file no matter what computer you use it on)? Is that because of the .pdf extension, or because of bytes that are somewhere in the file?

  • Where can I find a list of which bytes mean what in a file?

Recommended answer

On traditional UNIX systems, files are identified solely by looking for particular patterns of bytes appearing in the file.

The file command uses a magic configuration file (often /etc/magic, or /usr/share/file/magic) which contains the rules defining those byte patterns.
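To make the idea of byte patterns concrete, here is a minimal Java sketch that checks the first few bytes of a file against three well-known signatures (PDF, PNG, GIF). It is not how the file command is implemented, and the real magic database holds far more rules. The class name MagicSniffer and its methods are hypothetical, and readNBytes assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class MagicSniffer {

    // Leading byte signatures of three common formats.
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] GIF = "GIF8".getBytes(StandardCharsets.US_ASCII);

    // Returns a short type label based only on the leading bytes.
    static String sniff(byte[] head) {
        if (startsWith(head, PDF)) return "pdf";
        if (startsWith(head, PNG)) return "png";
        if (startsWith(head, GIF)) return "gif";
        return "unknown";
    }

    // True if data begins with the given prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Reads up to 8 bytes from the file and classifies them (Java 11+).
    static String sniffFile(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(8));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            System.out.println(sniffFile(Path.of(args[0])));
        } else {
            byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
            System.out.println(sniff(sample)); // prints pdf
        }
    }
}
```

Because the check looks only at content, renaming the file does not change the result, while altering its first bytes does.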

That's it: there is no special extra metadata. It's all done by analysis of the content.
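Since detection depends on content, the question's idea of leaving the leading bytes untouched can be sketched as below. This is an illustration under assumptions: the class HeaderPreservingTransform and its method are hypothetical, the header length is whatever the caller chooses (the question's "10 bytes" was itself only an example), and a single-byte XOR stands in for the real cipher purely because it is easy to reverse.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class HeaderPreservingTransform {

    // Copies the input, leaves the first headerLength bytes unchanged,
    // and XORs every remaining byte with the key.
    // XOR is a placeholder for the actual cipher, not a recommendation.
    static byte[] transformBody(byte[] input, int headerLength, byte key) {
        byte[] out = input.clone();
        int start = Math.max(0, Math.min(headerLength, out.length));
        for (int i = start; i < out.length; i++) {
            out[i] = (byte) (out[i] ^ key);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] original = "%PDF-1.4 body bytes".getBytes(StandardCharsets.US_ASCII);
        byte[] encrypted = transformBody(original, 5, (byte) 0x5A);
        byte[] restored = transformBody(encrypted, 5, (byte) 0x5A);

        // The signature survives, and applying the XOR twice restores the input.
        System.out.println(new String(encrypted, 0, 5, StandardCharsets.US_ASCII)); // prints %PDF-
        System.out.println(Arrays.equals(original, restored));                      // prints true
    }
}
```

A file treated this way would probably still be identified by its signature, but its body would no longer be valid for that format, so a viewer is unlikely to open it successfully.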
