Parsing a single quote char in a single-quoted string in Parsec
Problem description
I've got a silly situation in my Parsec parsers that I would like your help with.
I need to parse a sequence of strings / chars that are separated by | characters. So, we could have a|b|'c'|'abcd'
which should be turned into
[a,b,c,abcd]
Spaces are not allowed, except inside a ' ' string. In my naïve attempt, I can now parse strings like a'a|'bb' to [a'a,bb], but not aa|'b'b' to [aa,b'b].
singleQuotedChar :: Parser Char
singleQuotedChar = noneOf "'" <|> try (string "''" >> return '\'')

simpleLabel = do
    whiteSpace haskelldef
    lab <- many1 (noneOf "|")
    return $ lab

quotedLabel = do
    whiteSpace haskelldef
    char '\''
    lab <- many singleQuotedChar
    char '\''
    return $ lab
Now, how do I tell the parser to treat a ' as the closing quote only if it is followed by a | or white space? (Or, work some counting of ' chars into this.) The input is user generated, so I cannot rely on users escaping quotes as \'.
Note that allowing a quote in the middle of a string delimited by quotes is very confusing to read, but I believe this should allow you to parse it.
quotedLabel = do -- reads the first quote.
    whiteSpace
    char '\''
    quotedLabel2

quotedLabel2 = do -- reads the string and the finishing quote.
    lab <- many singleQuotedChar
    try (do more <- quotedLabel3
            return $ lttrace "quotedLabel2" (lab ++ more))
      <|> (do char '\''
              return $ lttrace "quotedLabel2" lab)

quotedLabel3 = do -- handle middle quotes
    char '\''
    lookAhead $ noneOf ['|']
    ret <- quotedLabel2
    return $ lttrace "quotedLabel3" $ "'" ++ ret
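For comparison, here is a minimal self-contained sketch of the same idea, written with manyTill instead of the mutually recursive parsers above. It is not the answerer's code. It makes several assumptions: a quote closes the string only when it is followed by | or by the end of the input (trailing white space is not handled), a doubled '' is not treated as an escape, and the whiteSpace and lttrace helpers are left out. The names closingQuote, label, labels and parseLabels are made up for this illustration.

```haskell
import Control.Monad (void)
import Text.Parsec
import Text.Parsec.String (Parser)

-- A quote counts as the closing quote only if '|' or end of input follows.
-- 'try' rewinds the consumed quote when the lookahead fails.
closingQuote :: Parser ()
closingQuote = try (char '\'' *> lookAhead (void (char '|') <|> eof))

-- Opening quote, then any characters (including quotes) up to the closing quote.
quotedLabel :: Parser String
quotedLabel = char '\'' *> manyTill anyChar closingQuote

-- Unquoted label: no '|' and no spaces, but quotes in the middle are kept.
simpleLabel :: Parser String
simpleLabel = many1 (noneOf "| ")

-- Try the quoted form first; fall back to a simple label if it does not close.
label :: Parser String
label = try quotedLabel <|> simpleLabel

-- One or more labels separated by '|', covering the whole input.
labels :: Parser [String]
labels = sepBy1 label (char '|') <* eof

parseLabels :: String -> Either ParseError [String]
parseLabels = parse labels ""
```

With these definitions, parseLabels "aa|'b'b'" should give Right ["aa","b'b"], and parseLabels "a'a|'bb'" should give Right ["a'a","bb"]. The lookahead does the same job as lookAhead $ noneOf ['|'] in quotedLabel3, only stated from the other side: it checks what must follow a closing quote rather than what must follow a middle one.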