Parsec-Parser works alright, but could it be done better?


Problem description

I am trying to do the following:

Parse a text of the form:

Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}

into a list of some data structure:

[Inside "Some Text ",Outside (0,0,0),Inside " some Text ",Outside (0,0,0),Outside (0,0,0),Inside " more Text ",Outside (0,0,0)]

So the #{a,b,c} bits should be turned into something different from the rest of the text.

I have this code:

module ParsecTest where

import Text.ParserCombinators.Parsec
import Control.Monad

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
              deriving (Show)

text :: Parser Transc
text = do
         x <- manyTill anyChar ((lookAhead reference) <|> (eof >> return (Inside "")))
         return (Inside x)

transc = reference <|> text

alot :: Parser [Transc]
alot = do
        manyTill transc eof

reference :: Parser Transc
reference = try (do{ char '#';
                     char '{';
                     a <- number;
                     char ',';
                     b <- number;
                     char ',';
                     c <- number;
                     char '}';
                     return (Outside (a,b,c)) })

number :: Parser Int
number = do{ x <- many1 digit;
             return (read x) }

This works as expected. You can test this in ghci by typing

parseTest alot "Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}"

But I don't think it is very elegant.

1) Is the use of lookAhead really necessary for my problem?

2) Is the return (Inside "") an ugly hack?

3) Is there generally a more concise/smarter way to achieve the same?

Solution

1) I think you do need lookAhead as you need the result of that parse. It would be nice to avoid running that parser twice by having a Parser (Transc,Maybe Transc) to indicate an Inside with an optional following Outside. If performance is an issue, then this is worth doing.
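
The Parser (Transc, Maybe Transc) idea might look roughly like the sketch below. The names chunk, alot3 and flatten are made up for illustration and do not come from the question; the sketch assumes a recent parsec 3 and GHC, where <$>, <*> and <$ are available from the Prelude. Instead of peeking with lookAhead and then parsing the reference again, chunk consumes the terminating reference itself and hands it back alongside the text.

```haskell
import Data.Maybe (maybeToList)
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number :: Parser Int
number = read <$> many1 digit

-- try ensures that a failed attempt at a reference consumes no input.
reference :: Parser Transc
reference = try $ do
  _ <- string "#{"
  a <- number
  _ <- char ','
  b <- number
  _ <- char ','
  c <- number
  _ <- char '}'
  return (Outside (a, b, c))

-- Hypothetical helper: a run of plain text together with the reference
-- that ended it, if any. The reference is consumed here, so it is
-- parsed only once.
chunk :: Parser (Transc, Maybe Transc)
chunk = do
  _ <- lookAhead anyChar          -- fail without consuming at end of input
  (txt, end) <- go
  return (Inside txt, end)
  where
    go =  (\r -> ([], Just r)) <$> reference
      <|> ([], Nothing) <$ eof
      <|> (\c (cs, e) -> (c : cs, e)) <$> anyChar <*> go

-- Flatten the pairs back into the list shape used in the question,
-- dropping the empty text between two adjacent references.
alot3 :: Parser [Transc]
alot3 = concatMap flatten <$> many chunk <* eof
  where
    flatten (Inside "", end) = maybeToList end
    flatten (t, end)         = t : maybeToList end
```

Running parse alot3 "" on the sample input from the question should produce the same list as alot. Whether the extra bookkeeping is worth it depends on how expensive the reference parser is.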

2) Yes.
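
One possible way to get rid of the placeholder value, as a sketch: the dummy Inside "" only exists so that both alternatives of the terminator have the same type. Discarding the result of the lookahead with void (from Control.Monad) makes both sides a Parser (), so no fake value is needed. The rest is the code from the question, lightly tidied.

```haskell
import Control.Monad (void)
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number :: Parser Int
number = read <$> many1 digit

reference :: Parser Transc
reference = try $ do
  _ <- string "#{"
  a <- number
  _ <- char ','
  b <- number
  _ <- char ','
  c <- number
  _ <- char '}'
  return (Outside (a, b, c))

-- Stop at the next reference (without consuming it) or at end of input.
-- void throws away the looked-ahead Transc, so no dummy value is needed.
text :: Parser Transc
text = Inside <$> manyTill anyChar (void (lookAhead reference) <|> eof)

transc :: Parser Transc
transc = reference <|> text

alot :: Parser [Transc]
alot = manyTill transc eof
```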

3) Yes, with Applicative style:

number2 :: Parser Int
number2 = read <$> many1 digit

text2 :: Parser Transc
text2 = (Inside .) . (:) 
     <$> anyChar 
     <*> manyTill anyChar (try (lookAhead reference2) *> pure () <|> eof)


reference2 :: Parser Transc
reference2 = ((Outside .) .) . (,,) 
          <$> (string "#{" *> number2 <* char ',') 
          <*> number2 
          <*> (char ',' *> number2 <* char '}')

transc2 = reference2 <|> text2

alot2 = many transc2

You may want to rewrite the beginning of reference2 using a helper like aux x y z = Outside (x,y,z).
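
With such a helper, reference2 could be sketched as follows. This is only an illustration of the suggestion; aux is the helper name used above, placed in a where clause here as one possible choice. It replaces the point-free ((Outside .) .) . (,,) with a plain three-argument function, which most readers will find easier to follow.

```haskell
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number2 :: Parser Int
number2 = read <$> many1 digit

-- Same parser as before, but the three numbers are combined by a
-- named helper instead of composed sections.
reference2 :: Parser Transc
reference2 = aux
          <$> (string "#{" *> number2 <* char ',')
          <*> number2
          <*> (char ',' *> number2 <* char '}')
  where
    aux x y z = Outside (x, y, z)
```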

EDIT: Changed text to deal with inputs that don't end with an Outside.
