Question about unpacking a binary file: endian troubles
Hi guys,

This may be a dumb question; I'm just getting into C language here.

I wrote a program to unpack a binary file and write out the contents to
a new file as a list of unsigned integers. It works on an IBM mainframe
under AIX with GCC, but has trouble on Intel with GCC (DJGPP). I think
it's an endian problem.

The trouble is, although most values in the extracted list are correct,
I know of at least two values that are definitely wrong. They should be
249 and I get 4294967289 (or -7, depending on whether I have the output
to unsigned integer or just integer, respectively).

Anyway, it sure looks like an endian mixup to me.

The extract program...

#include <stdio.h>

int main ()
{
    int fdi, n;
    char buf[1];
    FILE *fdo, *fopen();

    /* Open binary bitmap file */
    fdi=open("output.bmp", 0);
    if (fdi==-1) {
        printf ("Can't open bitmap file.\n");
        exit (1);
    }

    /* Open decimal output file */
    fdo=fopen("output.dec", "w");

    /* Copy contents of file */
    while ((n=read (fdi, buf, 1))>0) {
        fprintf (fdo, "%u\n", (unsigned int) buf[0]);
    }

    /* Close all files */
    close (fdi);
    close (fdo);
}

Any idea what to do on Intel to make it work?

Thanks,
Dave
David Buchan wrote:
> Hi guys,
> This may be a dumb question; I'm just getting into C language here.
> I wrote a program to unpack a binary file and write out the contents to
> a new file as a list of unsigned integers. It works on an IBM mainframe
> under AIX with GCC, but has trouble on Intel with GCC (DJGPP). I think
> it's an endian problem.
> The trouble is, although most values in the extracted list are correct,
> I know of at least two values that are definitely wrong. They should be
> 249 and I get 4294967289 (or -7, depending on whether I have the output
> to unsigned integer or just integer, respectively).
> Anyway, it sure looks like an endian mixup to me.

I don't think so, since you seem to be dealing with
single-byte quantities. "Endianness" only comes into play
with multi-byte objects: the question is how the multiple
bytes are organized. When there's only one byte, there's
only one organization.
More likely, you're running into confusion over whether
a `char' is signed or unsigned: on some implementations a
`char' is always zero or positive, while on others some
`char' values are negative. If that's the root of your
problem, the simplest (and most "honest") fix is to read
the data into an `unsigned char' rather than a plain `char'.
There are a few other issues with your code:

> The extract program...
> #include <stdio.h>

You're using the exit() function, so you should
include <stdlib.h> to declare it.

> int main ()

`int main(void)' is preferable. There's no effect
on the correctness of the code, but it's considered
better style.

> {
> int fdi, n;
> char buf[1];

This should be `unsigned char buf[1]'. If you adopt
a suggestion I'll make a little further down, it can
even be just `unsigned char buf'.

> FILE *fdo, *fopen();

DON'T try to write free-hand declarations of library
functions; you'll sometimes get them wrong and even when
you get them right you may miss out on "implementation
magic" that makes them work more efficiently. In this
case you've already included <stdio.h> so fopen() is
already declared (correctly and efficiently); don't
muddy the waters.

> /* Open binary bitmap file */
> fdi=open("output.bmp", 0);

Sorry; open() is not a C function. Use fopen()
with "rb" (read, binary) as the second argument and
then use getc() to read each character. `fdi' becomes
a `FILE*' instead of an `int', and the subsequent test
for failure compares against NULL instead of -1.

> if (fdi==-1) {
> printf ("Can't open bitmap file.\n");
> exit (1);

exit(EXIT_FAILURE) is better, because an exit
status of `1' means different things to different
systems. EXIT_FAILURE is declared in <stdlib.h>,
which you included so as to declare exit() properly,
remember?

Alternatively, you could `return EXIT_FAILURE;'
since this is the main() function.

> }
> /* Open decimal output file */
> fdo=fopen("output.dec", "w");

Add `if (fdo == NULL) ...' to check for failure.

> /* Copy contents of file */
> while ((n=read (fdi, buf, 1))>0) {

When you switch to using fopen() and getc() on
the input file, you'll almost certainly experience a
sudden increase in your program's speed. The code
you've written is likely to be the slowest possible
way to read the data.

> fprintf (fdo, "%u\n", (unsigned int) buf[0]);
> }
> /* Close all files */
> close (fdi);

This would become fclose().

> close (fdo);

This should have been fclose() in the first place.
As it stands, it's flat-out wrong.

Finally, you need `return EXIT_SUCCESS;' so your
`int'-valued main() can pass something back to the
environment. "Falling off the end," even though it's
been granted an indulgence in the latest version of
the C Standard, was incorrect under the older Standard
and has always been a sign of sloppiness.

> }
> Any idea what to do on Intel to make it work?
Eric Sosman <er*********@sun.com> writes:
> David Buchan wrote:
> [...]
>> /* Open binary bitmap file */
>> fdi=open("output.bmp", 0);
>
> Sorry; open() is not a C function. Use fopen()
> with "rb" (read, binary) as the second argument and
> then use getc() to read each character. `fdi' becomes
> a `FILE*' instead of an `int', and the subsequent test
> for failure compares against NULL instead of -1.

Using open() isn't exactly wrong, it's just non-portable. open() is a
standard function, it's just defined by the POSIX standard, not by the
C standard. You can use it if you like, but it has some drawbacks:
your program is no longer portable to implementations that support ISO
C but not POSIX, and you can't get good answers about it in comp.lang.c.

But in this case, I see no advantage in using open() rather than
fopen(), and a number of potential disadvantages.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Eric Sosman wrote:
> David Buchan wrote:
>> Hi guys,
>> This may be a dumb question; I'm just getting into C language here.
>> I wrote a program to unpack a binary file and write out the contents to
>> a new file as a list of unsigned integers. It works on an IBM mainframe
>> under AIX with GCC, but has trouble on Intel with GCC (DJGPP). I think
>> it's an endian problem.
>> The trouble is, although most values in the extracted list are correct,
>> I know of at least two values that are definitely wrong. They should be
>> 249 and I get 4294967289 (or -7, depending on whether I have the output
>> to unsigned integer or just integer, respectively).
>> Anyway, it sure looks like an endian mixup to me.
>
> I don't think so, since you seem to be dealing with
> single-byte quantities. "Endianness" only comes into play
> with multi-byte objects: the question is how the multiple
> bytes are organized. When there's only one byte, there's
> only one organization.
> More likely, you're running into confusion over whether
> a `char' is signed or unsigned: on some implementations a
> `char' is always zero or positive, while on others some
> `char' values are negative. If that's the root of your
> problem, the simplest (and most "honest") fix is to read
> the data into an `unsigned char' rather than a plain `char'.
Maybe the data was written to the file as 2 or 4 byte integers on a MSB
machine?
Bj?rn
[snip]