Reading an ASCII string


Question




Hi,

I'm a beginner using C# and .NET.

I have big legacy files that store various values (ints, bytes, strings) and want to read them into
a C# programme so that I can store them in a database. The files were written by a late-1980s PC
Pascal programme, for which I don't have the source code. I've managed to reverse engineer the file
format.

The strings are stored as ASCII in the file, with the first byte indicating the string length, and
the rest being the ASCII (i.e. 8-bit) characters. The string length is always 0, 20 or 40 characters
(never any more) and strings are end-padded with space characters where necessary.

What is the best way to quickly read a string and get rid of the space padding at the end? To make
sure I can read them correctly, I'll put them in a text box. I assume the string used in a text box
uses 16-bit characters (Unicode?) but I may be wrong here. When I'm happy I can read them correctly,
I'll get rid of the text box and store them directly in the database. Is it best to store them in the
database as Unicode? I'm tempted to use ASCII for efficiency.

I was thinking of using a binary reader (_br) to extract from the file. That should be fine for
everything else, but I don't know how to cope with the ASCII strings.

Recommended answer

Hi John,

First, ASCII is 7-bit, and any character above 127 will need the proper encoding to be read correctly.
I'm assuming the characters are stored as 8-bit.

You can either read the FileStream directly or treat the data as a single string from a textbox.

You will need a loop, which in this case would be something like

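    using System.Collections.Generic;
    using System.Text;

    // data holds the raw bytes of the file, e.g. from File.ReadAllBytes(filepath)
    int index = 0;
    List<string> strings = new List<string>();

    while (index < data.Length)
    {
        // The first byte of each string is its length
        int numBytes = data[index];
        index++;

        // Copy the next numBytes bytes to a new string
        string newString = Encoding.ASCII.GetString(data, index, numBytes);
        index += numBytes;

        // Remove the space padding if needed
        strings.Add(newString.TrimEnd(' '));

        // Advance index further here if the format has extra bytes between strings
    }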

It may be easiest to treat the data as a char array or as a string, in which case a textbox should be easy enough. Using a FileStream, you would need to read the file as ASCII.

File.ReadAllText(filepath, System.Text.Encoding.ASCII); (C# 2.0)

If the file isn't ASCII but uses all 8 bits for data, then you need to figure out the correct encoding by trial and error.
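One way to check whether plain ASCII is enough is to look for bytes above 127 before decoding. The sketch below is only an illustration: the class and method names (AsciiCheck, IsPureAscii, DecodeAscii) are made up for this example. It relies on the fact that Encoding.ASCII replaces any byte above 127 with '?' when decoding.

```csharp
using System;
using System.Text;

public static class AsciiCheck
{
    // Returns true when every byte fits in 7-bit ASCII (0..127).
    public static bool IsPureAscii(byte[] data)
    {
        foreach (byte b in data)
        {
            if (b > 127) return false;
        }
        return true;
    }

    // Decodes with Encoding.ASCII; bytes above 127 come back as '?',
    // so extended characters are silently lost.
    public static string DecodeAscii(byte[] data)
    {
        return Encoding.ASCII.GetString(data);
    }
}
```

If IsPureAscii returns false for the string data in a file, that is a hint that a different single-byte encoding is needed.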
--
Happy coding!
Morten Wennevik [C# MVP]


Check out the System.Text Namespace, specifically the Encoding, Encoder, and
Decoder classes.

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Chicken Salad Alchemist

Big thicks are made up of lots of little thins.



John wrote:

> Hi,
>
> I'm a beginner using C# and .NET.
>
> I have big legacy files that store various values (ints, bytes, strings) and want to read them into
> a C# programme so that I can store them in a database. The files were written by a late-1980s PC
> Pascal programme, for which I don't have the source code. I've managed to reverse engineer the file
> format.
>
> The strings are stored as ASCII in the file, with the first byte indicating the string length, and
> the rest being the ASCII (i.e. 8-bit) characters.



Yes, that's how strings are stored in Pascal.


> The string length is always 0, 20 or 40 characters
> (never any more) and strings are end-padded with space characters where necessary.



Does the length include the padding or not? If it does, you just have to
trim the string. If it doesn't, you have to calculate how much padding
there is from the length of the string, and skip that number of bytes.
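The two cases could look roughly like this. This is a sketch under assumptions, not a description of the actual file format: the names PascalField, ReadTrimmed and ReadFixed are hypothetical, and the second method assumes each field always occupies one length byte plus a fixed capacity (for example 20 or 40 bytes), which would need to be confirmed against the real files.

```csharp
using System;
using System.Text;

public static class PascalField
{
    // Case 1: the length byte counts the padding, so the field is
    // 1 + length bytes long and only the trailing spaces need trimming.
    public static string ReadTrimmed(byte[] data, ref int offset)
    {
        int length = data[offset];
        string text = Encoding.ASCII.GetString(data, offset + 1, length);
        offset += 1 + length;
        return text.TrimEnd(' ');
    }

    // Case 2 (assumed layout): the length byte counts only the real
    // characters and the field always occupies 1 + capacity bytes.
    public static string ReadFixed(byte[] data, ref int offset, int capacity)
    {
        int length = data[offset];
        string text = Encoding.ASCII.GetString(data, offset + 1, length);
        offset += 1 + capacity; // skip the text and the unused padding
        return text;
    }
}
```

In both methods offset is advanced past the whole field, so the caller can continue reading the next value from the same byte array.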


> What is the best way to quickly read a string and get rid of the space padding at the end?



Read the length using ReadByte, then use the ReadChars method to get the
string. You get an array of char; if you want a string, just create one
from the array.
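A minimal sketch of that approach, assuming the length byte includes the space padding and that the reader was created with a single-byte encoding such as Encoding.ASCII. The class and method names (PascalStringReader, ReadPaddedString) are illustrative only.

```csharp
using System;
using System.IO;
using System.Text;

public static class PascalStringReader
{
    // Reads one length-prefixed string from the reader's current position
    // and strips the trailing space padding.
    public static string ReadPaddedString(BinaryReader reader)
    {
        int length = reader.ReadByte();          // first byte is the string length
        char[] chars = reader.ReadChars(length); // then 'length' characters
        return new string(chars).TrimEnd(' ');
    }
}
```

It would be called on a reader opened with something like new BinaryReader(stream, Encoding.ASCII), in between the ReadInt32 or ReadByte calls used for the other values in the file.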


> To make
> sure I can read them correctly, I'll put them in a text box. I assume the string used in a text box
> uses 16-bit characters (Unicode?) but I may be wrong here. When I'm happy I can read them correctly,
> I'll get rid of the text box and store them directly in the database. Is it best to store them in the
> database as Unicode? I'm tempted to use ASCII for efficiency.
>
> I was thinking of using a binary reader (_br) to extract from the file. That should be fine for
> everything else, but I don't know how to cope with the ASCII strings.



Yes, a BinaryReader is exactly what I would suggest using.

Specify the encoding when you create the BinaryReader; that way it can
handle reading chars, and you don't have to read bytes and decode them.

ASCII encoding won't work if the strings contain extended characters
(above 127). Use Encoding.GetEncoding(850) to get the encoding for
extended ASCII (code page 850, the DOS Latin-1 code page).
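As a rough sketch of how the reader might be created with that encoding (the helper name Cp850Reader.Open is made up for this example): on .NET Core and .NET 5 or later, the legacy code pages have to be registered first, whereas on .NET Framework code page 850 is available directly.

```csharp
using System;
using System.IO;
using System.Text;

public static class Cp850Reader
{
    // Creates a BinaryReader that decodes characters as code page 850.
    public static BinaryReader Open(Stream stream)
    {
        // Needed on .NET Core / .NET 5+ to make legacy code pages available;
        // on .NET Framework, where code page 850 is built in, it is not needed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return new BinaryReader(stream, Encoding.GetEncoding(850));
    }
}
```

With this reader, a byte such as 0x82 is returned by ReadChar or ReadChars as 'é' rather than the '?' that Encoding.ASCII would produce. Whether 850 is the right code page for these particular files is something to verify against known strings.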

