How to create a parser (lex/yacc)?

Question

I have the following file, which needs to be parsed:

--TestFile
Start ASDF123
Name "John"
Address "#6,US" 
end ASDF123

Lines that start with -- are treated as comment lines. The file starts with 'Start' and ends with 'end'. The string after Start is the UserID, and the name and address are enclosed in double quotes.

I need to parse the file and write the parsed data into an XML file.

So the generated file would look like this:

<ASDF123>
  <Name Value="John" />
  <Address Value="#6,US" />
</ASDF123>

Right now I'm using pattern matching (regular expressions) to parse the above file. Here is my sample code.

    /// <summary>
    /// To Store the row data from the file
    /// </summary>
    List<String> MyList = new List<String>();

    String strName = "";
    String strAddress = "";
    String strInfo = "";



Method : ReadFile

    /// <summary>
    /// To read the file into a List
    /// </summary>
    private void ReadFile()
    {
        StreamReader Reader = new StreamReader(Application.StartupPath + "\\TestFile.txt");
        while (!Reader.EndOfStream)
        {
            MyList.Add(Reader.ReadLine());
        }
        Reader.Close();
    }



Method : FormateRowData

    /// <summary>
    /// To remove comments 
    /// </summary>
    private void FormateRowData()
    {
        MyList = MyList.Where(X => X != "").Where(X => X.StartsWith("--")==false ).ToList();
    }



Method : ParseData

    /// <summary>
    /// To Parse the data from the List
    /// </summary>
    private void ParseData()
    {
        Match l_mMatch;
        Regex RegData = new Regex("start[ \t\r\n]*(?<Data>[a-z0-9]*)", RegexOptions.IgnoreCase);
        Regex RegName = new Regex("name [ \t\r\n]*\"(?<Name>[a-z]*)\"", RegexOptions.IgnoreCase);
        Regex RegAddress = new Regex("address [ \t\r\n]*\"(?<Address>[a-z0-9 #,]*)\"", RegexOptions.IgnoreCase);
        for (int Index = 0; Index < MyList.Count; Index++)
        {
            l_mMatch = RegData.Match(MyList[Index]);
            if (l_mMatch.Success)
                strInfo = l_mMatch.Groups["Data"].Value;
            l_mMatch = RegName.Match(MyList[Index]);
            if (l_mMatch.Success)
                strName = l_mMatch.Groups["Name"].Value;
            l_mMatch = RegAddress.Match(MyList[Index]);
            if (l_mMatch.Success)
                strAddress = l_mMatch.Groups["Address"].Value;
        }
    }



Method : WriteFile

    /// <summary>
    /// To write parsed information into file.
    /// </summary>
    private void WriteFile()
    {
        XDocument XD = new XDocument(
                           new XElement(strInfo,
                                         new XElement("Name",
                                             new XAttribute("Value", strName)),
                                         new XElement("Address",
                                             new XAttribute("Value", strAddress))));
        XD.Save(Application.StartupPath + "\\File.xml");
    }



I've heard about ParserGenerator.

Please help me write a parser using lex and yacc. The reason is that my existing parser (pattern matching) is not flexible; moreover, I don't think it is the right way to do this.

How do I make use of ParserGenerator? (I've read Code Project Sample One and Code Project Sample Two, but I'm still not familiar with it.) Please suggest some parser generators that output C# parsers.

Recommended answer

Gardens Point LEX and the Gardens Point Parser Generator are strongly influenced by LEX and YACC, and output C# code.

Your grammar is simple enough that I think your current approach is fine, but kudos for wanting to learn the "real" way of doing it. :-) So here's my suggestion for a grammar (just the production rules; this is far from a full example. The actual GPPG file needs to replace the ... by C# code for building the syntax tree, and you need token declarations etc. - read the GPPG examples in the documentation. And you also need the GPLEX file that describes the tokens):

/* Your input file is a list of "top level elements" */
TopLevel : 
    TopLevel TopLevelElement { ... }
    | /* (empty) */

/* A top level element is either a comment or a block. 
   The COMMENT token must be described in the GPLEX file as 
   any line that starts with -- . */
TopLevelElement:
    Block { ... }
    | COMMENT { ... }

/* A block starts with the token START (which, in the GPLEX file, 
   is defined as the string "Start"), continues with some identifier 
   (the block name), then has a list of elements, and finally the token
   END followed by an identifier. If you want to validate that the
   END identifier is the same as the START identifier, you can do that
   in the C# code that analyses the syntax tree built by GPPG.
   The token Identifier is also defined with a regular expression in GPLEX. */
Block:
    START Identifier BlockElementList END Identifier { ... }

BlockElementList:
    BlockElementList BlockElement { ... }
    | /* empty */

BlockElement:
    NAME QuotedString { ... }
    | ADDRESS QuotedString { ... }
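
The grammar comments above leave two things to hand-written C#: the syntax tree that the `{ ... }` actions build, and the check that the END identifier matches the START identifier. The following is only a minimal sketch of what those pieces might look like. The class names `Block` and `BlockElement` and the members `IdsMatch` and `ToXml` are hypothetical, not part of GPPG or GPLEX, and the code assumes C# 6 or later. It shows only the tree and the XML step, not the generated parser itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical node for one "NAME QuotedString" or "ADDRESS QuotedString" line.
public sealed class BlockElement
{
    public string Keyword { get; }   // e.g. "Name" or "Address"
    public string Value { get; }     // quoted string with the quotes removed

    public BlockElement(string keyword, string value)
    {
        Keyword = keyword;
        Value = value;
    }
}

// Hypothetical node for "START Identifier BlockElementList END Identifier".
public sealed class Block
{
    public string StartId { get; }
    public string EndId { get; }
    public IReadOnlyList<BlockElement> Elements { get; }

    public Block(string startId, string endId, IEnumerable<BlockElement> elements)
    {
        StartId = startId;
        EndId = endId;
        Elements = elements.ToList();
    }

    // The grammar accepts any identifier after END, so the match is checked here.
    public bool IdsMatch => string.Equals(StartId, EndId, StringComparison.Ordinal);

    // Builds the element structure from the question: <ID><Name Value="..." />...</ID>
    public XElement ToXml()
    {
        if (!IdsMatch)
            throw new InvalidOperationException(
                $"Block '{StartId}' is closed by 'end {EndId}'.");

        return new XElement(StartId,
            Elements.Select(e => new XElement(e.Keyword, new XAttribute("Value", e.Value))));
    }
}

public static class Program
{
    public static void Main()
    {
        // In a real parser these nodes would be created by the grammar actions.
        var block = new Block("ASDF123", "ASDF123", new[]
        {
            new BlockElement("Name", "John"),
            new BlockElement("Address", "#6,US"),
        });
        Console.WriteLine(block.ToXml());
    }
}
```

Keeping the identifier check in the tree code rather than in the grammar follows the suggestion in the `Block` comment: the grammar stays context-free, and a mismatch can be reported with a specific error message.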
