Newbie code review of parsing program Please

Posted: 2019/6/6 18:55:29 | Tag: python

Problem description

I have created the following program to read a text file which happens
to be a COBOL file definition. The program then writes a file containing
what is essentially a list definition, which I can later copy and paste
into a Python program. I will eventually expand the program to also
output an SQL script to create a SQL file in MySQL.

The program still needs a little work; it does not handle the following
items yet:

1. It does not handle OCCURS yet.
2. It does not handle REDEFINES yet.
3. GROUP structures will need work.
4. It does not create the SQL script yet.

I anticipate that any files created by this program may need manual
tweaking, but I have a large number of COBOL file definitions which I
may need to work with, and this seemed like a better solution than
typing each list definition and SQL create-file script by hand.

What I would like is for some kind soul to review my code and give me
some suggestions on how I might improve it. I think regular expressions
might cut the code down, or at least simplify the parsing, but I'm just
starting to read those chapters in the book ;)

*** SAMPLE INPUT FILE ***

000100 FD  SALESMEN-FILE
000200     LABEL RECORDS ARE STANDARD
000300     VALUE OF FILENAME IS "SALESMEN".
000400
000500 01  SALESMEN-RECORD.
000600     05  SALESMEN-NO          PIC 9(3).
000700     05  SALESMEN-NAME        PIC X(30).
000800     05  SALESMEN-TERRITORY   PIC X(30).
000900     05  SALESMEN-QUOTA       PIC S9(7) COMP.
001000     05  SALESMEN-1ST-BONUS   PIC S9(5)V99 COMP.
001100     05  SALESMEN-2ND-BONUS   PIC S9(5)V99 COMP.
001200     05  SALESMEN-3RD-BONUS   PIC S9(5)V99 COMP.
001300     05  SALESMEN-4TH-BONUS   PIC S9(5)V99 COMP.

*** PROGRAM CODE ***

#!/usr/bin/python

import sys

f_path = '/home/lenyel/Bruske/MCBA/Internet/'
f_name = sys.argv[1]

fd = open(f_path + f_name, 'r')

def fmtline(fieldline):
    # fieldline is the split source line without its sequence number:
    # [level, name, 'PIC', picture, optional 'COMP.']
    size = ''
    type = ''
    dec = ''
    codeline = []
    if fieldline.count('COMP.') > 0:
        left = fieldline[3].find('(') + 1
        right = fieldline[3].find(')')
        num = fieldline[3][left:right].lstrip()
        if fieldline[3].count('V'):
            left = fieldline[3].find('V') + 1
            dec = int(len(fieldline[3][left:]))
            # floor division keeps the size an integer on any Python version
            size = ((int(num) + int(dec)) // 2) + 1
        else:
            size = (int(num) // 2) + 1
            dec = 0
        type = 'Pdec'
    elif fieldline[3][0] in ('X', '9'):
        dec = 0
        left = fieldline[3].find('(') + 1
        right = fieldline[3].find(')')
        size = int(fieldline[3][left:right].lstrip('0'))
        if fieldline[3][0] == 'X':
            type = 'Xstr'
        else:
            type = 'Xint'
    else:
        dec = 0
        left = fieldline[3].find('(') + 1
        right = fieldline[3].find(')')
        size = int(fieldline[3][left:right].lstrip('0'))
        if fieldline[3][0] == 'X':
            type = 'Xint'
    codeline.append(fieldline[1].replace('-', '_').replace('.',
        '').lower())
    codeline.append(size)
    codeline.append(type)
    codeline.append(dec)
    return codeline

wrkfd = []
rec_len = 0

for line in fd:
    if line[6] == '*':          # drop comment lines
        continue
    newline = line.split()
    if len(newline) == 1:       # drop blank line
        continue
    newline = newline[1:]
    if 'FILENAME' in newline:
        filename = newline[-1].replace('"', '').lower()
        filename = filename.replace('.', '')
        output = open('/home/lenyel/Bruske/MCBA/Internet/' + filename
            + '.fd', 'w')
        code = filename + ' = [\n'
        output.write(code)
    elif newline[0].isdigit() and 'PIC' in newline:
        wrkfd.append(fmtline(newline))
        rec_len += wrkfd[-1][1]

fd.close()

fmtfd = []

for wrkline in wrkfd[:-1]:
    fmtline = str(tuple(wrkline)) + ',\n'
    output.write(fmtline)

fmtline = tuple(wrkfd[-1])
fmtline = str(fmtline) + '\n'
output.write(fmtline)

lastline = ']\n'
output.write(lastline)

lenrec = filename + '_len = ' + str(rec_len)
output.write(lenrec)

output.close()

*** RESULTING OUTPUT ***

salesmen = [
('salesmen_no', 3, 'Xint', 0),
('salesmen_name', 30, 'Xstr', 0),
('salesmen_territory', 30, 'Xstr', 0),
('salesmen_quota', 4, 'Pdec', 0),
('salesmen_1st_bonus', 4, 'Pdec', 2),
('salesmen_2nd_bonus', 4, 'Pdec', 2),
('salesmen_3rd_bonus', 4, 'Pdec', 2),
('salesmen_4th_bonus', 4, 'Pdec', 2)
]
salesmen_len = 83
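
The SQL script listed as a to-do item could be generated from a list definition like the one above. What follows is only a rough sketch under stated assumptions: the helper names `sql_type` and `create_table` and the mapping from the post's type codes to MySQL column types are hypothetical and do not come from the thread. It follows the post's treatment of COMP fields as packed decimals, where `size` bytes hold `2 * size - 1` digits.

```python
# Hypothetical mapping from the post's type codes to MySQL column types;
# the thread itself does not specify one.
def sql_type(size, kind, decimals):
    if kind == "Xstr":
        return "CHAR(%d)" % size
    if kind == "Xint":
        return "INT"
    if kind == "Pdec":
        # size is the packed byte count; 2 * size - 1 digits fit in it.
        return "DECIMAL(%d,%d)" % (2 * size - 1, decimals)
    raise ValueError("unknown field type: %r" % kind)


def create_table(table, fields):
    """Build a CREATE TABLE statement from (name, size, type, dec) tuples."""
    columns = ",\n".join(
        "  %s %s" % (name, sql_type(size, kind, decimals))
        for name, size, kind, decimals in fields
    )
    return "CREATE TABLE %s (\n%s\n);" % (table, columns)


# A few entries copied from the resulting output above.
salesmen = [
    ("salesmen_no", 3, "Xint", 0),
    ("salesmen_name", 30, "Xstr", 0),
    ("salesmen_quota", 4, "Pdec", 0),
    ("salesmen_1st_bonus", 4, "Pdec", 2),
]
print(create_table("salesmen", salesmen))
```

Whether `INT` and `CHAR` are the right targets depends on the data; the column types would probably need the same manual tweaking the post anticipates for the list definitions.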

If you find this code useful please feel free to use any or all of it
at your own risk.

Thanks
Len S
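
The regular-expression idea mentioned in the post can be sketched roughly as follows. This is a minimal illustration, not the poster's code: the pattern `FIELD_RE` and the function `parse_field` are hypothetical names, and the pattern assumes every field line looks like the sample input (six-digit sequence number, two-digit level, name, `PIC`, a picture with one parenthesised count, optional `V99`-style decimals, optional `COMP`, closing period). OCCURS, REDEFINES and group items are not handled.

```python
import re

# Hypothetical pattern for the elementary field lines in the sample input.
FIELD_RE = re.compile(
    r"^\s*\d{6}\s+"                        # sequence number
    r"(?P<level>\d{2})\s+"                 # level number, e.g. 05
    r"(?P<name>[A-Z0-9-]+)\s+"             # field name
    r"PIC\s+S?(?P<kind>[X9])\((?P<digits>\d+)\)"
    r"(?:V(?P<decimals>9+))?"              # optional implied decimals
    r"(?P<comp>\s+COMP)?\s*\.\s*$"         # optional COMP, closing period
)


def parse_field(line):
    """Return (name, size, type, decimals) for a PIC line, else None."""
    match = FIELD_RE.match(line)
    if match is None:
        return None
    name = match.group("name").replace("-", "_").lower()
    digits = int(match.group("digits"))
    decimals = len(match.group("decimals") or "")
    if match.group("comp"):
        # Same size formula as the posted program.
        return (name, (digits + decimals) // 2 + 1, "Pdec", decimals)
    kind = "Xstr" if match.group("kind") == "X" else "Xint"
    return (name, digits, kind, 0)


print(parse_field("001000     05  SALESMEN-1ST-BONUS   PIC S9(5)V99 COMP."))
```

Lines that do not match, such as the FD header or the 01 record line, return `None`, so the caller can skip them instead of indexing into a split list.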

Solution

"len" <ls******@gmail.com> wrote in message
news:fc**********************************@u18g2000pro.googlegroups.com...
[...]

You might want to check out the pyparsing library.

-Mark

On Nov 16, 12:40 pm, "Mark Tolonen" <M8R-yft...@mailinator.com> wrote:
> "len" <lsumn...@gmail.com> wrote in message
> news:fc**********************************@u18g2000pro.googlegroups.com...
> [...]
>
> You might want to check out the pyparsing library.
>
> -Mark

Thanks Mark, I will check it out right now.

Len

Mark Tolonen wrote:
>
> "len" <ls******@gmail.com> wrote in message
> news:fc**********************************@u18g2000pro.googlegroups.com...
> [...]
>
> You might want to check out the pyparsing library.

And you might want to trim your messages to avoid quoting irrelevant
stuff. This is not directed personally at Mark, but at all readers.

Loads of us do it, and I wish we'd stop it. It's poor netiquette because
it forces people to skip past stuff that isn't relevant to the point
being made. It's also a global waste of bandwidth and storage space,
though that's less important than it used to be.

regards
 Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
