Parsec-Parser works alright, but could it be done better?
I am trying to parse a text of the form:
Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}
into a list of some data structure:
[Inside "Some Text ",Outside (0,0,0),Inside " some Text ",Outside (0,0,0),Outside (0,0,0),Inside " more Text ",Outside (0,0,0)]
So the #{a,b,c} bits should be turned into something different from the rest of the text.
I have this code:
module ParsecTest where

import Text.ParserCombinators.Parsec
import Control.Monad  -- the Haskell 98 module name "Monad" is not available in current GHC

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
    deriving (Show)

text :: Parser Transc
text = do
    x <- manyTill anyChar ((lookAhead reference) <|> (eof >> return (Inside "")))
    return (Inside x)

transc = reference <|> text

alot :: Parser [Transc]
alot = do
    manyTill transc eof

reference :: Parser Transc
reference = try (do{ char '#';
                     char '{';
                     a <- number;
                     char ',';
                     b <- number;
                     char ',';
                     c <- number;
                     char '}';
                     return (Outside (a,b,c)) })

number :: Parser Int
number = do{ x <- many1 digit;
             return (read x) }
This works as expected. You can test this in ghci by typing
parseTest alot "Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}"
But I think it's not nice.
1) Is the use of lookAhead really necessary for my problem?
2) Is the return (Inside "") an ugly hack?
3) Is there generally a more concise/smarter way to achieve the same?
1) I think you do need lookAhead, as you need the result of that parse. It would be nice to avoid running that parser twice by having a Parser (Transc, Maybe Transc) to indicate an Inside with an optional following Outside. If performance is an issue, then this is worth doing.
2) Yes.
3) Applicatives:
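-- On GHC before 7.10, add this to the module's import list
-- (newer Preludes already export these operators):
-- import Control.Applicative ((<$>), (<*>), (*>), (<*), pure)

number2 :: Parser Int
number2 = read <$> many1 digit

-- At least one character, then everything up to the next reference or end of input.
text2 :: Parser Transc
text2 = (Inside .) . (:)
    <$> anyChar
    <*> manyTill anyChar (try (lookAhead reference2) *> pure () <|> eof)

reference2 :: Parser Transc
reference2 = ((Outside .) .) . (,,)
    <$> (string "#{" *> number2 <* char ',')
    <*> number2
    <*> (char ',' *> number2 <* char '}')

transc2 = reference2 <|> text2

alot2 = many transc2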
You may want to rewrite the beginning of reference2 using a helper like aux x y z = Outside (x,y,z).
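A minimal sketch of that rewrite, assuming Parsec 3 (the Text.Parsec modules) and the types from the question. The helper aux is the one named above; wrapping the parser in try is an addition that is not in the answer's reference2. It mirrors the question's original reference parser, so that a malformed reference such as "#{1,2" fails without consuming input and can fall through to another alternative.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number2 :: Parser Int
number2 = read <$> many1 digit

-- Helper suggested in the answer: builds an Outside from three numbers.
aux :: Int -> Int -> Int -> Transc
aux x y z = Outside (x, y, z)

-- Same grammar as reference2, with the point-free prefix replaced by aux.
-- The 'try' is an assumption added here, not part of the answer's code.
reference2 :: Parser Transc
reference2 = try $ aux
  <$> (string "#{" *> number2 <* char ',')
  <*> number2
  <*> (char ',' *> number2 <* char '}')
```

In ghci, parseTest reference2 "#{1,2,3}" exercises it the same way as the test line in the question.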
EDIT: Changed text to deal with inputs that don't end with an Outside.
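The idea from point 1, a Parser (Transc, Maybe Transc) that parses each reference only once, could look roughly like the sketch below. It assumes Parsec 3; the names chunk and alot3 are hypothetical and do not appear in the question or the answer. Empty Inside "" pieces are dropped, which is a design choice made here to keep the output in the shape the question expects.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number :: Parser Int
number = read <$> many1 digit

-- A reference such as #{1,2,3}; 'try' makes a failed attempt consume nothing.
reference :: Parser Transc
reference = try $ (\a b c -> Outside (a, b, c))
  <$> (string "#{" *> number <* char ',')
  <*> number
  <*> (char ',' *> number <* char '}')

-- Hypothetical helper: collect characters until a reference or end of input.
-- Returns the collected text plus the reference that ended it, if there was one,
-- so no lookAhead (and no second run of 'reference') is needed.
chunk :: Parser (Transc, Maybe Transc)
chunk = go []
  where
    go acc =
          ((\r -> (Inside (reverse acc), Just r)) <$> reference)
      <|> ((Inside (reverse acc), Nothing) <$ eof)
      <|> (anyChar >>= \c -> go (c : acc))

-- Hypothetical top-level parser: repeat 'chunk' until it reports end of input.
alot3 :: Parser [Transc]
alot3 = do
  (t, next) <- chunk
  case next of
    Nothing -> return (keep t)
    Just r  -> ((keep t ++ [r]) ++) <$> alot3
  where
    -- Drop empty text pieces, e.g. between two adjacent references.
    keep (Inside "") = []
    keep x           = [x]
```

As with the other parsers, parseTest alot3 "Some Text #{0,0,0} some Text" in ghci is a quick way to try it. Whether this is actually faster than the lookAhead version depends on the input and would need measuring.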