Question about unpacking a binary file: endian troubles


Problem description

Hi guys,

This may be a dumb question; I'm just getting into C language here.

I wrote a program to unpack a binary file and write out the contents to
a new file as a list of unsigned integers. It works on an IBM mainframe
under AIX with GCC, but has trouble on Intel with GCC (DJGPP). I think
it's an endian problem.

The trouble is, although most values in the extracted list are correct,
I know of at least two values that are definitely wrong. They should be
249 and I get 4294967289 (or -7, depending on whether I have the output
to unsigned integer or just integer, respectively).

Anyway, it sure looks like an endian mixup to me.

The extract program...

#include <stdio.h>

int main ()
{
    int fdi, n;
    char buf[1];
    FILE *fdo, *fopen();

    /* Open binary bitmap file */
    fdi=open("output.bmp", 0);
    if (fdi==-1) {
        printf ("Can't open bitmap file.\n");
        exit (1);
    }

    /* Open decimal output file */
    fdo=fopen("output.dec", "w");

    /* Copy contents of file */
    while ((n=read (fdi, buf, 1))>0) {
        fprintf (fdo, "%u\n", (unsigned int) buf[0]);
    }

    /* Close all files */
    close (fdi);
    close (fdo);
}

Any idea what to do on Intel to make it work?

Thanks,
Dave

Solution

David Buchan wrote:

> Hi guys,
>
> This may be a dumb question; I'm just getting into C language here.
>
> I wrote a program to unpack a binary file and write out the contents to
> a new file as a list of unsigned integers. It works on an IBM mainframe
> under AIX with GCC, but has trouble on Intel with GCC (DJGPP). I think
> it's an endian problem.
>
> The trouble is, although most values in the extracted list are correct,
> I know of at least two values that are definitely wrong. They should be
> 249 and I get 4294967289 (or -7, depending on whether I have the output
> to unsigned integer or just integer, respectively).
>
> Anyway, it sure looks like an endian mixup to me.

I don't think so, since you seem to be dealing with
single-byte quantities. "Endianness" only comes into play
with multi-byte objects: the question is how the multiple
bytes are organized. When there's only one byte, there's
only one organization.

More likely, you're running into confusion over whether
a `char' is signed or unsigned: on some implementations a
`char' is always zero or positive, while on others some
`char' values are negative. If that's the root of your
problem, the simplest (and most "honest") fix is to read
the data into an `unsigned char' rather than a plain `char'.

There are a few other issues with your code:

> The extract program...
>
> #include <stdio.h>

You're using the exit() function, so you should
include <stdlib.h> to declare it.

> int main ()

`int main(void)' is preferable. There's no effect
on the correctness of the code, but it's considered
better style.

> {
>     int fdi, n;
>     char buf[1];

This should be `unsigned char buf[1]'. If you adopt
a suggestion I'll make a little further down, it can
even be just `unsigned char buf'.

>     FILE *fdo, *fopen();

DON'T try to write free-hand declarations of library
functions; you'll sometimes get them wrong and even when
you get them right you may miss out on "implementation
magic" that makes them work more efficiently. In this
case you've already included <stdio.h> so fopen() is
already declared (correctly and efficiently); don't
muddy the waters.

>     /* Open binary bitmap file */
>     fdi=open("output.bmp", 0);

Sorry; open() is not a C function. Use fopen()
with "rb" (read, binary) as the second argument and
then use getc() to read each character. `fdi' becomes
a `FILE*' instead of an `int', and the subsequent test
for failure compares against NULL instead of -1.

>     if (fdi==-1) {
>         printf ("Can't open bitmap file.\n");
>         exit (1);

exit(EXIT_FAILURE) is better, because an exit
status of `1' means different things to different
systems. EXIT_FAILURE is declared in <stdlib.h>,
which you included so as to declare exit() properly,
remember?

Alternatively, you could `return EXIT_FAILURE;'
since this is the main() function.

>     }
>
>     /* Open decimal output file */
>     fdo=fopen("output.dec", "w");

Add `if (fdo == NULL) ...' to check for failure.

>     /* Copy contents of file */
>     while ((n=read (fdi, buf, 1))>0) {

When you switch to using fopen() and getc() on
the input file, you'll almost certainly experience a
sudden increase in your program's speed. The code
you've written is likely to be the slowest possible
way to read the data.

>         fprintf (fdo, "%u\n", (unsigned int) buf[0]);
>     }
>
>     /* Close all files */
>     close (fdi);

This would become fclose().

>     close (fdo);

This should have been fclose() in the first place.
As it stands, it's flat-out wrong.

Finally, you need `return EXIT_SUCCESS;' so your
`int'-valued main() can pass something back to the
environment. "Falling off the end," even though it's
been granted an indulgence in the latest version of
the C Standard, was incorrect under the older Standard
and has always been a sign of sloppiness.

> }
>
> Any idea what to do on Intel to make it work?



--
Er*********@sun.com


Eric Sosman <er*********@sun.com> writes:

> David Buchan wrote:
>
> [...]
>
>> /* Open binary bitmap file */
>> fdi=open("output.bmp", 0);
>
> Sorry; open() is not a C function. Use fopen()
> with "rb" (read, binary) as the second argument and
> then use getc() to read each character. `fdi' becomes
> a `FILE*' instead of an `int', and the subsequent test
> for failure compares against NULL instead of -1.

Using open() isn't exactly wrong, it's just non-portable. open() is a
standard function, it's just defined by the POSIX standard, not by the
C standard. You can use it if you like, but it has some drawbacks:
your program is no longer portable to implementations that support ISO
C but not POSIX, and you can't get good answers about it in comp.lang.c.

But in this case, I see no advantage in using open() rather than
fopen(), and a number of potential disadvantages.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.


Eric Sosman wrote:

> David Buchan wrote:
>
>> Hi guys,
>>
>> This may be a dumb question; I'm just getting into C language here.
>>
>> I wrote a program to unpack a binary file and write out the contents to
>> a new file as a list of unsigned integers. It works on an IBM mainframe
>> under AIX with GCC, but has trouble on Intel with GCC (DJGPP). I think
>> it's an endian problem.
>>
>> The trouble is, although most values in the extracted list are correct,
>> I know of at least two values that are definitely wrong. They should be
>> 249 and I get 4294967289 (or -7, depending on whether I have the output
>> to unsigned integer or just integer, respectively).
>>
>> Anyway, it sure looks like an endian mixup to me.
>
> I don't think so, since you seem to be dealing with
> single-byte quantities. "Endianness" only comes into play
> with multi-byte objects: the question is how the multiple
> bytes are organized. When there's only one byte, there's
> only one organization.
>
> More likely, you're running into confusion over whether
> a `char' is signed or unsigned: on some implementations a
> `char' is always zero or positive, while on others some
> `char' values are negative. If that's the root of your
> problem, the simplest (and most "honest") fix is to read
> the data into an `unsigned char' rather than a plain `char'.

Maybe the data was written to the file as 2 or 4 byte integers on an MSB
machine?

Bj?rn

[snip]

