FParsec: how to parse a date in FParsec (newbie)

Question

I am following Bill Casarin's post on how to parse delimited files with FParsec, and I am simplifying the logic to understand how the code works. I am parsing a multi-row delimited document into a Cell list list structure (for now), where a Cell is either a string or a float. I am a complete newbie at this.

I am having trouble parsing the floats. In the typical case (a tab-delimited cell containing a number) it works. However, when a cell happens to be a string that starts with a number, it falls apart.

How do I modify pFloatCell so that it either parses the whole cell (all the way through to the tab) as a float, or parses nothing?

Thanks

type Cell = 
    | String of string 
    | Float of float
.
.
.
let pStringCell delim = 
    manyChars (nonQuotedCellChar delim)
    |>> String

// this is my issue. pfloat parses the string one 
// char at a time, and once it starts off with a number 
// it is down that path, and errors out
let pFloatCell delim = 
    FParsec.CharParsers.pfloat
    |>> Float

let pCell delim = 
    (pFloatCell delim) <|> (pStringCell delim)
.
.
.
let ParseTab s  =
  let delim = "\t"
  let res = run (csv delim) s in
    match res with
     | Success (rows, _, _) -> { IsSuccess = true; ErrorMsg = "Ok"; Result = stripEmpty rows }
     | Failure (s, _, _) -> { IsSuccess = false; ErrorMsg = s; Result = [[]] }
.
.
.
let test() =

    let parsed = ParseTab data

Oops, it was late for me last night. I meant to post the data. This first one works:

let data = 
    "s10 Mar 2011 18:28:11 GMT\n"

This one returns an error:

let data = 
    "10 Mar 2011 18:28:11 GMT\n"

It returns the following, both with and without ChaosP's recommendation:

ErrorMsg = "Error in Ln: 1 Col: 3\r\n10 Mar 2011 18:28:11 GMT\r\n ^\r\nExpecting: end of file, newline or '\t'\r\n"

It looks as though the attempt is working fine. In the second case it only grabs up to the 10, and pfloat appears to read only up to the first whitespace. I need to convince pfloat that it must look all the way to the next tab or newline, regardless of whether a space comes first. Alternatively, I could write my own version of pfloat using Double.Parse, but I would rather rely on the library.

Recommended answer

Since the text you are parsing is somewhat ambiguous, you will need to modify your pCell parser.

let sep delim =
     skipString delim <|> skipAnyOf "\r\n" <|> eof

let pCell delim = 
    attempt (pFloatCell delim .>> sep delim) <|> (pStringCell delim .>> sep delim)

This also means you will need to modify whichever parser uses pCell.

let pCells delim =
    // parenthesized so that many receives the parser, not the pCell function
    many (pCell delim)
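
For illustration, here is a minimal self-contained sketch of the same idea: try the float first, and fall back to a string cell unless the float is immediately followed by a cell boundary. It is a variation on the code above, not the original poster's full parser. nonQuotedCellChar is a hypothetical stand-in for the helper elided in the question, cellEnd and pRow are names made up for this example, and followedBy is used so that the boundary is checked without being consumed, which leaves the delimiter for sepBy to handle.

```fsharp
open FParsec

type Cell =
    | String of string
    | Float of float

// Hypothetical stand-in for the elided helper: any char except the delimiter or a line break.
let nonQuotedCellChar (delim: string) : Parser<char, unit> =
    noneOf (delim + "\r\n")

let pStringCell delim : Parser<Cell, unit> =
    manyChars (nonQuotedCellChar delim) |>> String

let pFloatCell : Parser<Cell, unit> =
    pfloat |>> Float

// Succeeds only when a delimiter, a line break, or end of input comes next; consumes nothing.
let cellEnd delim : Parser<unit, unit> =
    followedBy (skipString delim <|> skipAnyOf "\r\n" <|> eof)

// attempt rewinds the input when the float is not the whole cell,
// so the string parser gets to see the cell from its first character.
let pCell delim : Parser<Cell, unit> =
    attempt (pFloatCell .>> cellEnd delim) <|> pStringCell delim

// One row: cells separated by the delimiter.
let pRow delim = sepBy (pCell delim) (skipString delim)

let show input =
    match run (pRow "\t") input with
    | Success (cells, _, _) -> printfn "%A" cells
    | Failure (msg, _, _) -> printfn "%s" msg

show "10 Mar 2011 18:28:11 GMT"   // the whole cell is kept as a String
show "3.14\tabc"                  // a Float cell followed by a String cell
```

Because the boundary check consumes nothing, a cell parser built this way can be used with sepBy instead of many, which avoids applying many to a parser that can succeed on empty input.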

Note

The .>> operator is actually quite simple. Think of it as a leap-frog operator: the left-hand parser runs, then the right-hand parser runs, and the right-hand result is discarded so that the left-hand value is returned.

Parser<'a, 'b> -> Parser<'c, 'b> -> Parser<'a, 'b>
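
A small sketch of that behavior, alongside the mirror-image >>. operator, may help. The parser names keepLeft and keepRight and the sample inputs are made up for this example.

```fsharp
open FParsec

// .>> runs both parsers and keeps the result of the left one.
let keepLeft : Parser<int32, unit> = pint32 .>> pstring ";"

// >>. runs both parsers and keeps the result of the right one.
let keepRight : Parser<int32, unit> = pstring "#" >>. pint32

let report (p: Parser<int32, unit>) input =
    match run p input with
    | Success (value, _, _) -> printfn "parsed %d" value
    | Failure (msg, _, _) -> printfn "%s" msg

report keepLeft "42;"    // the ";" is consumed but its result is dropped
report keepRight "#42"   // the "#" is consumed but its result is dropped
```

In both cases the dot sits on the side of the parser whose value is kept.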
