Request for comments on simple Alex parser

Question

I've been looking at contributing code to the Haskell Yi editor and I want to add Git commit and rebase modes to it. I've never done anything with Alex before, so I decided to write a standalone commit parser outside of Yi before trying to add one to the editor. I couldn't find much documentation on Alex aside from the docs on the Alex page, which are really light on information about the monad wrapper, and that seems to be what the Yi project emulates.

Could anyone give me comments about what's wrong (and hopefully right) about this code? I'm pretty new to Haskell so any comments there would also be appreciated. The idea is that this will correctly handle message lines, comments, and diff lines (which will be under all of the above when you run git commit -v). I may add support for differentiating between the digest line and subsequent lines later but I wanted to stay simple for now.


{
module Main where
}

%wrapper "monad"

$commitChars = [$printable\t]
@diffStart = diff\ \-\-git\ $commitChars*

gitCommit :-

<0> {
^@diffStart$                   {makeAlexAction DiffDeclaration `andBegin` diff}
^\# $commitChars*$             {makeAlexAction Comment}
$commitChars*$                 {makeAlexAction MessageLine}
}

<diff> {
^@diffStart$                   {makeAlexAction DiffDeclaration}
^\- $commitChars*$             {makeAlexAction DiffRemove}
^\+ $commitChars*$             {makeAlexAction DiffAdd}
^$commitChars*$                {makeAlexAction DiffContext}
}

.                                     ;
[\n\r]                                ;

{
data GitCommitToken = Digest String
                    | MessageLine String
                    | Comment String
                    | DiffDeclaration String
                    | DiffAdd String
                    | DiffRemove String
                    | DiffContext String
                    | CommitEOF
                  deriving (Show, Eq)

makeAlexAction :: Monad m => (String -> GitCommitToken) -> AlexInput -> Int -> m GitCommitToken
makeAlexAction cons = \(_,_,inp) len -> return $ cons (take len inp)

alexEOF = return CommitEOF

alexMonadScanTokens :: Alex [GitCommitToken]
alexMonadScanTokens = do
  inp <- alexGetInput
  sc <- alexGetStartCode
  case alexScan inp sc of
    AlexEOF -> alexEOF >>= \eof -> return [eof]
    AlexError inp' -> alexError $ "lexical error: " ++ show inp'
    AlexSkip inp' len -> do
        alexSetInput inp'
        alexMonadScanTokens
    AlexToken inp' len action -> do
        alexSetInput inp'
        token <- action inp len
        tokens <- alexMonadScanTokens
        return $ token : tokens

main = do
     s <- getContents
     mapM_ print $ either (\_ -> []) id (runAlex s alexMonadScanTokens)
}


Recommended answer

First, thanks for contributing to Haskell, and to Yi!

Code review


  • I'd use bytestring parsing. Alex supports this now. Use the %wrapper "strict-bytestring" mode.
  • either (\_ -> []) id: a bit odd. I'd use an explicit case, with a better error for parse failure.
  • alexMonadScanTokens may have a space leak (at the least, it consumes the stack). Alex defines alexScanTokens for you, as a left fold over the input. I think you should be using alexMonadScan.
  • In a heavy duty parser, I'd use unpacked, strict bytestrings for the token types. E.g. Digest !ByteString
  • Your regexes look fine.
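
The second and fourth points above can be sketched in plain Haskell. This is only an illustration, not Alex-generated code: scanCommit is a hypothetical stand-in for runAlex s alexMonadScanTokens, the token type is trimmed to three constructors, and the bytestring package is assumed to be installed.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import System.Exit (exitFailure)
import System.IO (hPutStrLn, stderr)

-- Trimmed-down token type with strict, unpacked ByteString fields.
data GitCommitToken
  = MessageLine {-# UNPACK #-} !B.ByteString
  | Comment     {-# UNPACK #-} !B.ByteString
  | CommitEOF
  deriving (Show, Eq)

-- Hypothetical stand-in for `runAlex s alexMonadScanTokens`.
-- It classifies each line, and fails on a NUL byte so that the
-- Left branch below has something to report.
scanCommit :: B.ByteString -> Either String [GitCommitToken]
scanCommit input
  | B.elem '\0' input = Left "lexical error: unexpected NUL byte"
  | otherwise         = Right (map classify (B.lines input) ++ [CommitEOF])
  where
    classify l
      | B.isPrefixOf (B.pack "#") l = Comment l
      | otherwise                   = MessageLine l

main :: IO ()
main = do
  s <- B.getContents
  -- Explicit case instead of `either (\_ -> []) id`, so a scan
  -- failure is reported rather than silently printing nothing.
  case scanCommit s of
    Left err     -> hPutStrLn stderr ("scan failed: " ++ err) >> exitFailure
    Right tokens -> mapM_ print tokens
```

In the real scanner the Left value would carry the message produced by alexError, so the same case expression applies unchanged.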

In summary: a pretty good first go! Switch to bytestring parsing, and try testing on some large files to ensure that your scanner has no space leak, assuming you don't use the out-of-the-box alexScanTokens.
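
One way to keep a hand-written scanning loop from growing the stack is to collect tokens in an accumulator, so that the recursive call is the last thing the loop does. The sketch below is generic over any monad. collectUntil is a hypothetical helper, and the IORef-backed nextToken only stands in for a real scanner action such as alexMonadScan so that the file runs without Alex.

```haskell
module Main where

import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Run `step` repeatedly, collecting results until `isEnd` holds.
-- The recursive call is in tail position; tokens accumulate in
-- reverse order and are reversed once at the end.
collectUntil :: Monad m => (a -> Bool) -> m a -> m [a]
collectUntil isEnd step = go []
  where
    go acc = do
      t <- step
      if isEnd t
        then return (reverse (t : acc))
        else go (t : acc)

-- Stand-in scanner: yields the remaining lines one at a time,
-- then Nothing once the input is exhausted.
nextToken :: IORef [String] -> IO (Maybe String)
nextToken ref = do
  remaining <- readIORef ref
  case remaining of
    []       -> return Nothing
    (l : ls) -> writeIORef ref ls >> return (Just l)

main :: IO ()
main = do
  -- A large synthetic input, to exercise the loop on many tokens.
  ref <- newIORef (replicate 1000000 "message line")
  tokens <- collectUntil (== Nothing) (nextToken ref)
  print (length tokens)
```

With the monad wrapper, the equivalent call would look roughly like collectUntil (== CommitEOF) alexMonadScan, assuming alexEOF returns CommitEOF as in the question. Whether this removes the leak in practice is worth confirming by profiling on a large commit message.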

Look at the bytestring-lexing package for a bytestring-based Alex parser. Secondarily, the alex package itself has many nice examples that can help with idiomatic parsing:

  $ cabal unpack alex
  $ cd alex-*/examples
