Understanding `read, write` system calls in Unix

Question

My Systems Programming project has us implementing a compression/decompression program to crunch down ASCII text files by removing the zero top bit and writing the output to a separate file, depending on whether the compression or decompression routine is working. To do this, the professor has required us to use the binary files and Unix system calls, which include open, close, read, write, etc.

From my understanding of read and write, it reads the binary data by defined byte chunks. However, since this data is binary, I'm not sure how to parse it.

This is a stripped down version of my code, minus the error checking:

#include <fcntl.h>   /* open, O_RDONLY */
#include <unistd.h>  /* read, close */

#define BUFFER 4096  /* buffer size, tunable to system preference */

void compress(char readFile[]){
  char buffer[BUFFER];
  int openReadFile;
  openReadFile = open(readFile, O_RDONLY);
}

If I use read to read the data into buffer, will the data in buffer be in binary or character format? Nothing I've come across addresses that detail, and it's very relevant to how I parse the contents.

Recommended answer

read() will read the bytes in without any interpretation (so "binary" mode).

Since the data is binary and you want to access the individual bytes, you should use a buffer of unsigned char: unsigned char buffer[BUFFER]. You can regard char/unsigned char as bytes; they'll be 8 bits on Linux.
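To make the "raw bytes" point concrete, here is a minimal sketch of a read loop over an unsigned char buffer. The helper name count_high_bit_bytes is hypothetical (it is not from the question), and the sketch assumes a blocking descriptor and does not retry on EINTR. It counts the bytes whose top bit is set, which would be zero for a pure 7-bit ASCII file.

```c
#include <unistd.h>    /* read */
#include <sys/types.h> /* ssize_t */

#define BUFFER 4096 /* same tunable chunk size as in the question */

/* Hypothetical helper: reads fd until end of file and returns how many
 * bytes have their top bit set, or -1 if read() reports an error. */
long count_high_bit_bytes(int fd)
{
    unsigned char buffer[BUFFER];
    long count = 0;
    ssize_t n;

    /* read() returns the number of bytes stored, 0 at end of file, -1 on error.
     * The bytes arrive exactly as they are in the file, with no translation. */
    while ((n = read(fd, buffer, sizeof buffer)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (buffer[i] & 0x80)
                count++;
        }
    }
    return n < 0 ? -1 : count;
}
```

Note that read() may return fewer bytes than requested, so the loop only looks at the first n bytes of the buffer on each pass.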

Now, since what you're dealing with is 8-bit ASCII compressed down to 7 bits, you'll have to convert those 7 bits into 8 bits again so you can make sense of the data.

To explain what's been done - consider the text Hey. That's 3 bytes. The bytes have 8 bits each, and in ASCII these are the bit patterns:

01001000 01100101 01111001

Now, after removing the most significant bit from each byte, you shift the remaining bits to the left to close the gaps.

X1001000 X1100101 X1111001

Above, X is the bit to be removed. Removing those and shifting the others, you end up with bytes with this pattern:

10010001 10010111 11001000

The rightmost 3 bits are just filled in with 0. So far, no space is saved though: there are still 3 bytes. With a string of 8 bytes, we'd save 1 byte, as that would compress down to 7 bytes.
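The packing described above can be sketched with a small bit accumulator. This is only one possible way to write it, not necessarily what the assignment expects. The function name pack7 and its signature are made up for illustration, and the caller is assumed to supply an output buffer of at least (7 * n + 7) / 8 bytes.

```c
#include <stddef.h> /* size_t */

/* Hypothetical helper: packs n 7-bit ASCII bytes from in[] into out[],
 * most significant bit first, and returns the number of bytes written. */
size_t pack7(const unsigned char *in, size_t n, unsigned char *out)
{
    unsigned int acc = 0; /* pending bits, newest at the low end */
    int nbits = 0;        /* how many bits in acc are still unwritten */
    size_t o = 0;

    for (size_t i = 0; i < n; i++) {
        acc = (acc << 7) | (in[i] & 0x7Fu); /* drop the top bit, append 7 bits */
        nbits += 7;
        while (nbits >= 8) {                /* a full output byte is ready */
            out[o++] = (unsigned char)(acc >> (nbits - 8));
            nbits -= 8;
        }
        acc &= (1u << nbits) - 1u;          /* keep only the unwritten bits */
    }
    if (nbits > 0)                          /* zero-fill the last byte on the right */
        out[o++] = (unsigned char)(acc << (8 - nbits));
    return o;
}
```

Packing the three bytes of Hey this way yields 0x91 0x97 0xC8, which is the bit pattern 10010001 10010111 11001000 shown above, and packing 8 input bytes yields 7 output bytes.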

Now you have to do the reverse on the bytes you've read back in.
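A rough sketch of that reverse step, under the assumption that the 7-bit groups were stored most significant bit first with zero fill at the end: the name unpack7 and its signature are hypothetical, and the caller is assumed to provide room for (8 * n) / 7 output bytes.

```c
#include <stddef.h> /* size_t */

/* Hypothetical helper: expands n packed bytes from in[] back into 8-bit
 * characters (top bit 0) in out[] and returns how many were produced. */
size_t unpack7(const unsigned char *in, size_t n, unsigned char *out)
{
    unsigned int acc = 0; /* pending bits, newest at the low end */
    int nbits = 0;        /* how many bits in acc are still unconsumed */
    size_t o = 0;

    for (size_t i = 0; i < n; i++) {
        acc = (acc << 8) | in[i];  /* append the next 8 packed bits */
        nbits += 8;
        while (nbits >= 7) {       /* a full 7-bit character is available */
            out[o++] = (unsigned char)((acc >> (nbits - 7)) & 0x7Fu);
            nbits -= 7;
        }
        acc &= (1u << nbits) - 1u; /* keep only the leftover bits */
    }
    return o;                      /* leftover padding bits are discarded */
}
```

Feeding it 0x91 0x97 0xC8 gives back H, e, y. One caveat with this scheme: when the padding happens to be a full 7 zero bits (for example, 7 characters packed into 7 bytes), the sketch emits an extra NUL character at the end, so the decompressor needs some way to know the original length or must drop a trailing NUL.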
