Reading a COBOL-generated file

Question

I'm currently on the task of writing a C# application, which is going to sit between two existing apps. All I know about the second application is that it processes files generated by the first one. The first application is written in COBOL.

Steps:
1) The COBOL application writes some files and copies them to a directory.
2) The second application picks these files up and processes them.

My C# app would sit between 1) and 2). It would have to pick up the file generated by 1), read it, modify it and save it, so that application 2) wouldn't know I have even been there.

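I have a few problems.

  • First of all, if I open a file generated by 1) in Notepad, most of it is unreadable, while other parts are readable.

  • If I read the file, modify it and save it, I must save it in the same notation used by the COBOL application, so that application 2) doesn't know I've ever been there.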

I've tried reading the file this way, but it's still unreadable:

        string ss = @"filename";

        // StreamReader decodes the bytes as text (UTF-8 by default).
        using (FileStream fs = new FileStream(ss, FileMode.Open))
        using (StreamReader sr = new StreamReader(fs))
        {
            string gg = sr.ReadToEnd();
        }



Also, if I find a way of making it readable (using some sort of encoding technique), I'm afraid that when I save the file again, I may change its original format.

Any ideas? Suggestions?

Recommended answer

To read the COBOL-generated file, you'll need to know a few things.

First, you'll need the record layout (copybook) for the file. A COBOL record layout will look something like this:

01  PATIENT-TREATMENTS.
    05  PATIENT-NAME                PIC X(30).
    05  PATIENT-SS-NUMBER           PIC 9(9).
    05  NUMBER-OF-TREATMENTS        PIC 99 COMP-3.
    05  TREATMENT-HISTORY OCCURS 0 TO 50 TIMES
           DEPENDING ON NUMBER-OF-TREATMENTS
           INDEXED BY TREATMENT-POINTER.
        10  TREATMENT-DATE.
            15  TREATMENT-DAY        PIC 99.
            15  TREATMENT-MONTH      PIC 99.
            15  TREATMENT-YEAR       PIC 9(4).
        10  TREATING-PHYSICIAN       PIC X(30).
        10  TREATMENT-CODE           PIC 99.
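
As a rough illustration of how such a layout maps to byte offsets, here is a minimal C# sketch. It assumes the record is laid out exactly as declared, with no alignment padding and no record-length prefix, and that the unsigned PIC 99 COMP-3 counter occupies 2 octets (one pad nibble, two digit nibbles, one sign nibble). The class and member names are hypothetical.

```csharp
using System;

// Hypothetical offsets derived from the PATIENT-TREATMENTS copybook above.
public static class PatientRecordLayout
{
    public const int NameOffset = 0, NameLength = 30;       // PIC X(30)
    public const int SsnOffset = 30, SsnLength = 9;         // PIC 9(9)
    public const int CountOffset = 39, CountLength = 2;     // PIC 99 COMP-3
    public const int HistoryOffset = 41;                    // first TREATMENT-HISTORY entry
    public const int TreatmentLength = 2 + 2 + 4 + 30 + 2;  // one entry = 40 octets

    // Reads NUMBER-OF-TREATMENTS: nibbles are [pad][tens] [units][sign].
    public static int ReadTreatmentCount(byte[] record)
    {
        int tens = record[CountOffset] & 0x0F;
        int units = record[CountOffset + 1] >> 4;
        return tens * 10 + units;
    }

    // Total record length implied by the OCCURS ... DEPENDING ON clause.
    public static int RecordLength(byte[] record)
    {
        return HistoryOffset + TreatmentLength * ReadTreatmentCount(record);
    }
}
```

With a counter of 3, for example, the record would be 41 + 3 * 40 = 161 octets long under these assumptions.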

You'll also need a copy of IBM's Principles of Operation (S/360, S/370, z/OS; it doesn't really matter for our purposes). The latest is available from IBM at

  • http://www-01.ibm.com/support/docview.wss?uid=isg2b9de5f05a9d57819852571c500428f9a (but you'll need an IBM account).
  • An older edition is available, gratis, at http://www.hack.org/mc/texts/principles-of-operation.pdf

Chapters 8 (Decimal Instructions) and 9 (Floating Point Overview and Support Instructions) are the interesting bits for our purposes.

Without that, you're pretty much lost.

Then, you need to understand COBOL data types. For instance:


  • PIC defines an alphanumeric formatted field (PIC 9(4), for example, is 4 decimal digits, which might be filled with space characters if missing). PIC 999V99 is 5 decimal digits with an implied decimal point. And so on and so forth.
  • BINARY is [usually] a signed fixed-point binary integer. Usual sizes are halfword (2 octets) and fullword (4 octets).
  • COMP-1 is single-precision floating point.
  • COMP-2 is double-precision floating point.
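
For the BINARY case, a small C# sketch may help. It assumes the file comes from an IBM mainframe, where binary integers are stored big-endian (most significant octet first) in two's complement; the helper names are hypothetical and the shifts are done by hand so the code does not depend on any particular .NET version.

```csharp
using System;

// Hypothetical helpers for big-endian BINARY fields.
public static class BigEndianBinary
{
    // Halfword: 2 octets, signed.
    public static short ReadHalfword(byte[] record, int offset)
    {
        return unchecked((short)((record[offset] << 8) | record[offset + 1]));
    }

    // Fullword: 4 octets, signed.
    public static int ReadFullword(byte[] record, int offset)
    {
        return (record[offset] << 24) | (record[offset + 1] << 16)
             | (record[offset + 2] << 8) | record[offset + 3];
    }

    // Writes a halfword back in the same octet order it was read in.
    public static void WriteHalfword(short value, byte[] record, int offset)
    {
        record[offset] = unchecked((byte)(value >> 8));
        record[offset + 1] = unchecked((byte)value);
    }
}
```

Writing fields back with the same octet order is what keeps the modified file acceptable to the downstream application.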

If the data source is an IBM mainframe, COMP-1 and COMP-2 likely won't be IEEE floating point: they will be in IBM's base-16, excess-64 floating-point format. You'll need something like the S/370 Principles of Operation to help you understand it.
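
A minimal C# sketch of decoding a 4-octet COMP-1 value in that format follows. It assumes the usual single-precision layout (1 sign bit, a 7-bit excess-64 exponent of base 16, and a 24-bit fraction) stored big-endian; the class and method names are hypothetical.

```csharp
using System;

// Hypothetical decoder for IBM hexadecimal single-precision floating point.
public static class IbmHexFloat
{
    public static double ToDouble(byte[] record, int offset)
    {
        byte first = record[offset];
        int sign = (first & 0x80) != 0 ? -1 : 1;   // top bit is the sign
        int exponent = (first & 0x7F) - 64;        // excess-64, power of 16
        int fraction = (record[offset + 1] << 16)  // 24-bit fraction, 0 <= f < 1
                     | (record[offset + 2] << 8)
                     | record[offset + 3];
        return sign * (fraction / 16777216.0) * Math.Pow(16, exponent);
    }
}
```

For example, the octets C2 76 A0 00 decode to -118.625 with this routine. Converting an IEEE value back into this format is lossy in general, so a field that is not being modified is best copied through as raw octets.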


  • COMP-3 is 'packed decimal', of varying lengths. Packed decimal is a compact way of representing a decimal number. The declaration will look something like this: PIC S9999V99 COMP-3. This says that it is signed and consists of 6 decimal digits with an implied decimal point. Packed decimal represents each decimal digit as a nibble of an octet (hex values 0-9). The high-order digit is the upper nibble of the leftmost octet. The low nibble of the rightmost octet is a hex value A-F representing the sign. So the above PIC clause will require ceil( (6+1)/2 ) or 4 octets. The value -345.67, as represented by the above PIC clause, will look like 0x0034567D. The actual sign value may vary (the default is C/positive, D/negative, but A, C, E and F are treated as positive, while only B and D are treated as negative). Again, see the S/370 Principles of Operation for details on the representation.
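
The packed-decimal rules above can be sketched in C# roughly as follows. This is an illustration rather than a definitive implementation: the names are hypothetical, the number of implied decimal places (scale) has to be taken from the copybook, and the encoder always writes the default C/D sign nibbles.

```csharp
using System;

// Hypothetical COMP-3 (packed decimal) helpers.
public static class PackedDecimal
{
    public static decimal Decode(byte[] packed, int scale)
    {
        decimal value = 0;
        bool negative = false;
        for (int i = 0; i < packed.Length; i++)
        {
            int high = packed[i] >> 4, low = packed[i] & 0x0F;
            if (high > 9) throw new FormatException("Invalid digit nibble.");
            value = value * 10 + high;
            if (i < packed.Length - 1)
            {
                if (low > 9) throw new FormatException("Invalid digit nibble.");
                value = value * 10 + low;
            }
            else
            {
                // Last low nibble is the sign: B and D are negative.
                if (low < 0x0A) throw new FormatException("Invalid sign nibble.");
                negative = low == 0x0B || low == 0x0D;
            }
        }
        for (int i = 0; i < scale; i++) value /= 10m;  // apply the implied decimal point
        return negative ? -value : value;
    }

    public static byte[] Encode(decimal value, int scale, int length)
    {
        bool negative = value < 0;
        decimal digits = Math.Abs(value);
        for (int i = 0; i < scale; i++) digits *= 10m;
        digits = decimal.Truncate(digits);
        var result = new byte[length];
        result[length - 1] = (byte)(negative ? 0x0D : 0x0C);
        // Fill digit nibbles from right to left; nibble 0 is the sign.
        for (int n = 1; n < length * 2; n++)
        {
            int digit = (int)(digits % 10m);
            digits = decimal.Truncate(digits / 10m);
            int index = length - 1 - n / 2;
            result[index] |= (byte)(n % 2 == 1 ? digit << 4 : digit);
        }
        if (digits != 0) throw new OverflowException("Value does not fit in the field.");
        return result;
    }
}
```

Decoding the octets 00 34 56 7D with a scale of 2 gives -345.67, and encoding -345.67 into 4 octets gives the same octets back.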

Related to COMP-3 is zoned decimal. This might be declared as 'PIC S9999V99' (signed, 6 decimal digits, with an implied decimal point). Decimal digits, in EBCDIC, are the hex values 0xF0 - 0xF9. 'Unpack' (a mainframe machine instruction) takes a packed decimal field and turns it into a character field. The process is:


  • Start with the rightmost octet. Swap its nibbles, so the sign nibble is on top, and place it into the rightmost octet of the destination field.
  • Working from right to left (source and target both), strip off each remaining nibble of the packed decimal field, and place it into the low nibble of the next available octet in the destination. Fill the high nibble with a hex F.

The operation ends when either the source or destination field is exhausted.

If the destination field is not exhausted, it is left-padded with zeroes by filling the remaining octets with the EBCDIC character '0' (0xF0).

So our example value, -345.67, if stored with the default sign value (hex D), would get unpacked into an 8-octet field as 0xF0F0F0F3F4F5F6D7 ('0003456P', in EBCDIC).
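
The unpack procedure described above can be sketched in C# like this. It is an illustration of the steps as stated here, not a reproduction of the machine instruction's full behavior; the class and method names are hypothetical.

```csharp
using System;

// Hypothetical packed-to-zoned conversion following the steps above.
public static class ZonedDecimal
{
    public static byte[] Unpack(byte[] packed, int destinationLength)
    {
        var dest = new byte[destinationLength];
        // Pre-fill with EBCDIC '0' so any unused octets are left-padded.
        for (int i = 0; i < dest.Length; i++) dest[i] = 0xF0;
        if (packed.Length == 0 || destinationLength == 0) return dest;

        // Rightmost octet: swap nibbles so the sign ends up on top.
        byte last = packed[packed.Length - 1];
        int d = destinationLength - 1;
        dest[d--] = (byte)(((last & 0x0F) << 4) | (last >> 4));

        // Remaining nibbles, right to left, each under a hex F zone.
        for (int s = packed.Length - 2; s >= 0 && d >= 0; s--)
        {
            dest[d--] = (byte)(0xF0 | (packed[s] & 0x0F));
            if (d >= 0) dest[d--] = (byte)(0xF0 | (packed[s] >> 4));
        }
        return dest;
    }
}
```

Calling Unpack on the octets 00 34 56 7D with a destination length of 8 yields F0 F0 F0 F3 F4 F5 F6 D7, matching the worked example.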

[There you go. There's a quiz later]


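If the COBOL app lives on an IBM mainframe, has the file been converted from its native EBCDIC to ASCII? If not, you'll have to do the mapping yourself. Hint: it's not necessarily as straightforward as it might seem, since this may be a selective process: only character fields get converted (COMP-1, COMP-2, COMP-3 and BINARY are excluded, since they are binary octet sequences). Worse, there are multiple flavors of EBCDIC, due to varying national implementations and the different print chains in use on different printers.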
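
A hedged C# sketch of converting only the character fields might look like the following. Code page 37 (IBM037) is an assumption here; the actual EBCDIC flavor has to be confirmed with whoever owns the COBOL application. On .NET Core and later the code-page encodings must be registered first, as shown; on .NET Framework, Encoding.GetEncoding(37) should work without the registration call. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical per-field EBCDIC conversion; binary fields are never passed here.
public static class EbcdicText
{
    static EbcdicText()
    {
        // Makes legacy code pages such as IBM037 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Decodes one character field (for example PATIENT-NAME) from the raw record.
    public static string Decode(byte[] record, int offset, int length, int codePage = 37)
    {
        return Encoding.GetEncoding(codePage).GetString(record, offset, length);
    }

    // Writes text back into a fixed-width field, right-padded with EBCDIC spaces.
    public static void EncodeInto(string text, byte[] record, int offset, int length, int codePage = 37)
    {
        if (text.Length > length) throw new ArgumentException("Text is longer than the field.");
        byte[] bytes = Encoding.GetEncoding(codePage).GetBytes(text.PadRight(length));
        Array.Copy(bytes, 0, record, offset, length);
    }
}
```

Reading the whole file with File.ReadAllBytes, changing only the fields you need, and writing it back with File.WriteAllBytes leaves every untouched octet exactly as the COBOL program wrote it.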

Oh...one last thing. The mainframe hardware tends to like different things aligned on halfword, word or doubleword boundaries, so the record layout may not map directly to the octets in the file as there may be padding octets inserted between fields to maintain the needed word alignment.
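
If padding of this kind is present, field offsets have to be rounded up to the relevant boundary before reading. The helper below is a minimal C# sketch of that arithmetic; whether a given field is actually aligned is an assumption that has to be checked against the copybook and a hex dump of a real record. The names are hypothetical.

```csharp
using System;

// Hypothetical offset arithmetic for aligned fields.
public static class Alignment
{
    // Rounds offset up to the next multiple of boundary
    // (2 = halfword, 4 = fullword, 8 = doubleword).
    public static int AlignUp(int offset, int boundary)
    {
        if (boundary <= 0) throw new ArgumentOutOfRangeException(nameof(boundary));
        return (offset + boundary - 1) / boundary * boundary;
    }
}
```

For instance, a fullword field that would otherwise start at offset 41 would start at offset 44, with 3 padding octets before it.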

Good luck!
