Reading 8BIT chars from old DOS file


Problem description


Hi!
I want to load an old Pascal DOS file that
contains records. When I view the file
in a hex editor, it's clear how to access the
strings and chars in that file. Since these
are old 8-bit chars (C# uses 16-bit), I read
the file bytewise and convert the bytes
to chars using ENCODER.getChars(). From the chars
I build one big string, which should match the file as
I see it in the hex editor. But there are many errors
in it, as the getChars() method does not seem to
convert all the chars (bytes) properly.
Any suggestions?

Please help,
Thanks,
nick

Solution

Nick <ni**@winger.at> wrote:

I want to load an old Pascal DOS file that
contains records. When I view the file
in a hex editor, it's clear how to access the
strings and chars in that file. Since these
are old 8-bit chars (C# uses 16-bit), I read
the file bytewise and convert the bytes
to chars using ENCODER.getChars(). From the chars
I build one big string, which should match the file as
I see it in the hex editor. But there are many errors
in it, as the getChars() method does not seem to
convert all the chars (bytes) properly.
Any suggestions?



You need to use the right Encoding instance when grabbing the string
(you can usually use GetString rather than GetChars). What encoding was
it written in?

(Alternatively, just use a StreamReader with the appropriate encoding
to start with - you'll still need to know what encoding was used.)
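
A minimal sketch of both suggestions, assuming the file was written in code page 437 (that is only a guess; substitute whichever code page the Pascal program actually used). The names `DosText`, `GetDosEncoding`, `Decode` and `ReadAll` are made up for illustration. On .NET Core and .NET 5+ the legacy code pages have to be registered through `CodePagesEncodingProvider` first; on the classic .NET Framework `Encoding.GetEncoding(437)` should work on its own and the registration line can be dropped.

```csharp
using System;
using System.IO;
using System.Text;

static class DosText
{
    // Assumption: the data is in code page 437 (the original IBM PC / DOS set).
    // Pass 850, 1252, ... instead if the file turns out to use another one.
    public static Encoding GetDosEncoding(int codePage = 437)
    {
        // Makes the legacy code pages available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }

    // Decode a block of bytes in one call with GetString.
    public static string Decode(byte[] bytes)
    {
        return GetDosEncoding().GetString(bytes);
    }

    // Or let a StreamReader apply the encoding while reading the file.
    public static string ReadAll(string path)
    {
        using (var reader = new StreamReader(path, GetDosEncoding()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // 0x84 is 'ä' in code page 437, so these bytes spell "Häuser".
        byte[] sample = { 0x48, 0x84, 0x75, 0x73, 0x65, 0x72 };
        Console.WriteLine(Decode(sample));
    }
}
```

For a file of fixed-size records it is probably safer to call `Decode` only on the byte ranges that hold text, since the numeric fields in between are not character data.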

Note that strings are character data, whereas when you're viewing the
file with a hex editor you're viewing it as binary data, possibly with
a text interpretation (in whatever encoding the hex editor feels is
appropriate) available too.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information about this whole topic.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too


Hi!

You mentioned the text interpretation of the
hex editor. So in the hex editor every BYTE
is a character, and that's exactly what I want.
In the hex editor I can read all the info
I want to read out of the file, but with C# I
can't convert every byte. Can the hex editor check
the encoding, or is this "no" encoding when every byte
is a character?

Thanks,
nick


Nick <ni**@winger.at> wrote:

You mentioned the text interpretation of the
hex editor. So in the hex editor every BYTE
is a character, and that's exactly what I want.
In the hex editor I can read all the info
I want to read out of the file, but with C# I
can't convert every byte. Can the hex editor check
the encoding, or is this "no" encoding when every byte
is a character?



The hex editor can't really check the encoding because (say) *every*
file is a valid ISO-8859-1 file, just as an example.

When every byte is a character you still need to interpret that byte as
a character - and that depends on the encoding. The hex editor may well
be assuming an encoding such as Cp1252, or (for an old DOS file) Cp437.
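
To illustrate why the encoding matters even when every byte is one character, here is a small sketch that decodes the same byte under the three encodings mentioned above. The class and method names are hypothetical, and the code page numbers used are 28591 for ISO-8859-1, 1252 for Cp1252 and 437 for Cp437. As before, the `RegisterProvider` call is only needed on .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

static class EncodingComparison
{
    // Decode a single byte under the given code page and return the character.
    public static char DecodeByte(byte value, int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(new[] { value })[0];
    }

    static void Main()
    {
        const byte b = 0x84;
        // 28591 = ISO-8859-1, 1252 = Windows-1252, 437 = original DOS code page.
        foreach (int codePage in new[] { 28591, 1252, 437 })
        {
            char c = DecodeByte(b, codePage);
            // Print the Unicode code point rather than the glyph, so the
            // result does not depend on the console's own encoding.
            Console.WriteLine($"code page {codePage}: U+{(int)c:X4}");
        }
    }
}
```

All three decodings succeed, which is why a hex editor cannot tell from the bytes alone which one is right: 0x84 becomes the control character U+0084 in ISO-8859-1, the low double quote U+201E in Cp1252, and 'ä' (U+00E4) in Cp437.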

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

