Strategies to handle a file with multiple fixed formats

Views: 41 · Published: 2021/6/15 20:16:42 · Tags: perl


Problem description

This question is not Perl-specific (although the unpack function will most probably figure into my implementation).

I have to deal with files where multiple formats exist to hierarchically break down the data into meaningful sections. What I'd like to be able to do is parse the file data into a suitable data structure.

Here's an example (commentary on the right-hand side):

                                       # | Format | Level | Comment
                                       # +--------+-------+---------
**DEVICE 109523.69142                  #        1       1   file-specific
  .981    561A                         #        2       1
10/MAY/2010    24.15.30,13.45.03       #        3       2   group of records
05:03:01   AB23X  15.67   101325.72    #        4       3   part of single record
*           14  31.30474 13        0   #        5       3   part of single record
05:03:15   CR22X  16.72   101325.42    #        4       3   new record
*           14  29.16264 11        0   #        5       3
06:23:51   AW41X  15.67    101323.9    #        4       3
*           14  31.26493219        0   #        5       3
11/MAY/2010    24.07.13,13.44.63       #        3       2   group of new records
15:57:14   AB23X  15.67   101327.23    #        4       3   part of single record
*           14  31.30474 13        0   #        5       3   part of single record
15:59:59   CR22X  16.72   101331.88    #        4       3   new record
*           14  29.16264 11        0   #        5       3

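My current logic is fragile:

I know, for instance, that Format 2 always comes after Format 1, and that together they span only 2 lines.
I also know that Formats 4 and 5 always come in pairs, as they correspond to a single record. The number of records may vary.
I'm using regular expressions to infer the format of each line. However, this is risky and leaves no flexibility for the future (when someone decides to change the output format).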
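
One way to make these ordering rules explicit, rather than leaving them implicit in the regular expressions, might be a small table of allowed transitions between formats. The sketch below is in Python rather than Perl, since the question is not language-specific. The patterns, the `ALLOWED_NEXT` table, and the names `classify` and `parse` are all hypothetical, inferred from the sample above only. It also assumes that every date group holds at least one record and that the right-hand commentary is not part of the real file.

```python
import re

# Hypothetical markers guessed from the sample; the real layouts may differ.
LINE_PATTERNS = [
    (1, re.compile(r"^\*\*DEVICE\s")),
    (3, re.compile(r"^\d{2}/[A-Z]{3}/\d{4}\s")),
    (4, re.compile(r"^\d{2}:\d{2}:\d{2}\s")),
    (5, re.compile(r"^\*\s")),
]

# Which formats may follow which (None means start of file).
# Assumption: a date line (format 3) is always followed by a record.
ALLOWED_NEXT = {
    None: {1},
    1: {2},
    2: {3},
    3: {4},
    4: {5},
    5: {3, 4},
}


def classify(line, previous):
    """Return the format number of `line`, given the previous line's format."""
    if previous == 1:
        # Format 2 has no obvious marker, so it is identified by position only.
        fmt = 2
    else:
        fmt = next((n for n, rx in LINE_PATTERNS if rx.match(line)), None)
    if fmt not in ALLOWED_NEXT[previous]:
        raise ValueError(f"unexpected format {fmt} after {previous}: {line!r}")
    return fmt


def parse(lines):
    """Group raw lines into a header plus date groups of (format 4, format 5) pairs."""
    result = {"header": [], "groups": []}
    previous = None
    pending = None  # format-4 line waiting for its format-5 partner
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fmt = classify(line, previous)
        if fmt in (1, 2):
            result["header"].append(line)
        elif fmt == 3:
            result["groups"].append({"date": line, "records": []})
        elif fmt == 4:
            pending = line
        else:  # fmt == 5 completes the record started by the pending line
            result["groups"][-1]["records"].append((pending, line))
        previous = fmt
    if previous == 4:
        raise ValueError("file ends with an incomplete record")
    return result
```

Each stored line could then be split into fields by a per-format routine (for example with fixed column offsets), so a change to one layout would touch only that routine and its entry in the pattern table.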

The big question here is about what strategies I can employ to determine which format needs to be used for which line. I'd be interested to know if others have faced similar situations and what they've done to address it.
