Reading and sorting a variable-length CSV file
Question
We are using an OpenVMS system, and I believe it uses HP's COBOL.
We have a data file with a lot of records (500 MB or more), and the records are variable length and comma delimited. I would like to parse each record and extract the corresponding fields for processing. After that, I might want to sort the records by some particular fields. Is this possible with COBOL?
I've only seen sorting done with fixed-length records.
Recommended answer
Variable length is no problem. I'm not sure exactly how this is done in VMS COBOL, but the IBM way of writing it is:
FILE SECTION.
FD THE-FILE RECORD IS VARYING DEPENDING ON REC-LENGTH.
01 THE-RECORD PICTURE X(5000) .
WORKING-STORAGE SECTION.
01 REC-LENGTH PICTURE 9(5) COMPUTATIONAL.
When you read the file, REC-LENGTH will contain the length of the record just read. When you write a record, a record of length REC-LENGTH is written.
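As a rough illustration of the point above, here is a minimal sketch of a read loop that inspects REC-LENGTH after each READ. It reuses the FD from the answer; the file name "INPUT.CSV", the program name, and the MAX-LENGTH, REC-COUNT and EOF-FLAG items are made up for the example, and the ASSIGN clause may need adjusting for your compiler.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VARREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "INPUT.CSV" is a placeholder file name.
           SELECT THE-FILE ASSIGN TO "INPUT.CSV".
       DATA DIVISION.
       FILE SECTION.
       FD  THE-FILE RECORD IS VARYING DEPENDING ON REC-LENGTH.
       01  THE-RECORD        PICTURE X(5000).
       WORKING-STORAGE SECTION.
       01  REC-LENGTH        PICTURE 9(5) COMPUTATIONAL.
      * Hypothetical helper items, not part of the original answer.
       01  MAX-LENGTH        PICTURE 9(5) COMPUTATIONAL VALUE ZERO.
       01  REC-COUNT         PICTURE 9(9) COMPUTATIONAL VALUE ZERO.
       01  EOF-FLAG          PICTURE X VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT THE-FILE
           PERFORM UNTIL EOF-FLAG = "Y"
               READ THE-FILE
                   AT END
                       MOVE "Y" TO EOF-FLAG
                   NOT AT END
      * REC-LENGTH now holds the length of the record just read.
                       ADD 1 TO REC-COUNT
                       IF REC-LENGTH > MAX-LENGTH
                           MOVE REC-LENGTH TO MAX-LENGTH
                       END-IF
               END-READ
           END-PERFORM
           CLOSE THE-FILE
           DISPLAY "Records read:   " REC-COUNT
           DISPLAY "Longest record: " MAX-LENGTH
           STOP RUN.
```

Running something like this once over the data is a cheap way to check that 5000 characters is really enough for the longest record before building the rest of the job.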
To handle the delimited records you will probably need to use the UNSTRING verb to convert each one into a fixed format. This is pretty verbose (but then this is COBOL).
UNSTRING THE-RECORD DELIMITED BY ","
INTO field1, field2, field3, field4, field5 etc....
END-UNSTRING
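A slightly fuller sketch of the UNSTRING step follows, assuming a three-field layout. The FIXED-RECORD group, its field names and sizes, FIELD-COUNT, and the sample data are all invented for illustration; THE-RECORD and REC-LENGTH stand in for the items declared earlier. The reference modification (1:REC-LENGTH) limits the scan to the characters that belong to the current record, since the rest of the 5000-character area may still hold data from a previous, longer record.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSPLIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Stand-ins for the record area and length from the FD.
       01  THE-RECORD        PICTURE X(5000).
       01  REC-LENGTH        PICTURE 9(5) COMPUTATIONAL.
      * Hypothetical fixed-format layout; names and sizes are made up.
       01  FIXED-RECORD.
           05  CUST-ID       PICTURE X(10).
           05  CUST-NAME     PICTURE X(30).
           05  CUST-CITY     PICTURE X(20).
       01  FIELD-COUNT       PICTURE 9(3) COMPUTATIONAL.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Sample data in place of a real READ; 17 is its length.
           MOVE "1001,Smith,Boston" TO THE-RECORD
           MOVE 17 TO REC-LENGTH
           MOVE SPACES TO FIXED-RECORD
           MOVE ZERO TO FIELD-COUNT
      * Only the first REC-LENGTH characters belong to this record.
           UNSTRING THE-RECORD (1:REC-LENGTH) DELIMITED BY ","
               INTO CUST-ID, CUST-NAME, CUST-CITY
               TALLYING IN FIELD-COUNT
               ON OVERFLOW
                   DISPLAY "More fields than expected"
           END-UNSTRING
           DISPLAY "Fields found: " FIELD-COUNT
           DISPLAY "City: " CUST-CITY
           STOP RUN.
```

TALLYING IN counts the receiving fields that were filled, which gives a simple check for short records. Note that a plain DELIMITED BY "," does not understand quoted fields that contain commas, and a value longer than its receiving field is truncated without raising the overflow condition, so data like that would need extra handling.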
Once the record is in fixed format you can use SORT as normal.
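Putting the pieces together, one possible arrangement is a SORT with an INPUT PROCEDURE that reads each variable-length record, unstrings it into a fixed-length sort record, and RELEASEs it to the sort. This is only a sketch under assumptions: the file names, the three-field layout, and the choice of sort keys are hypothetical, and the ASSIGN clauses may need adjusting for HP COBOL on OpenVMS.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSORT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * All three file names are placeholders.
           SELECT THE-FILE    ASSIGN TO "INPUT.CSV".
           SELECT SORT-WORK   ASSIGN TO "SORTWORK".
           SELECT SORTED-FILE ASSIGN TO "SORTED.DAT".
       DATA DIVISION.
       FILE SECTION.
       FD  THE-FILE RECORD IS VARYING DEPENDING ON REC-LENGTH.
       01  THE-RECORD        PICTURE X(5000).
      * Hypothetical fixed-length sort record (60 characters).
       SD  SORT-WORK.
       01  SORT-RECORD.
           05  SR-ID         PICTURE X(10).
           05  SR-NAME       PICTURE X(30).
           05  SR-CITY       PICTURE X(20).
       FD  SORTED-FILE.
       01  SORTED-RECORD     PICTURE X(60).
       WORKING-STORAGE SECTION.
       01  REC-LENGTH        PICTURE 9(5) COMPUTATIONAL.
       01  EOF-FLAG          PICTURE X VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT SORT-WORK
               ON ASCENDING KEY SR-CITY SR-NAME
               INPUT PROCEDURE IS LOAD-RECORDS
               GIVING SORTED-FILE
           STOP RUN.
       LOAD-RECORDS.
           OPEN INPUT THE-FILE
           PERFORM UNTIL EOF-FLAG = "Y"
               READ THE-FILE
                   AT END
                       MOVE "Y" TO EOF-FLAG
                   NOT AT END
      * Skip empty records; (1:0) is not a valid reference.
                       IF REC-LENGTH > 0
                           PERFORM SPLIT-AND-RELEASE
                       END-IF
               END-READ
           END-PERFORM
           CLOSE THE-FILE.
       SPLIT-AND-RELEASE.
           MOVE SPACES TO SORT-RECORD
           UNSTRING THE-RECORD (1:REC-LENGTH) DELIMITED BY ","
               INTO SR-ID, SR-NAME, SR-CITY
           END-UNSTRING
      * Hand the fixed-format record to the sort.
           RELEASE SORT-RECORD.
```

Because the sort only ever sees the fixed-length SORT-RECORD, the variable length of the input stops mattering once the INPUT PROCEDURE has run. GIVING writes the sorted records straight to the output file; an OUTPUT PROCEDURE with RETURN could be used instead if further processing is needed.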