F#, FParsec, and Calling a Stream Parser Recursively, Second Take

Question

Thank you for the replies to my first post and my second post on this project. This question is basically the same question as the first, but with my code updated according to the feedback received on those two questions. How do I call my parser recursively?

I'm scratching my head and staring blankly at the code, with no idea where to go from here. That's when I turn to Stack Overflow.

I've included in code comments the compile-time errors I'm receiving. One stumbling block may be my discriminated union. I've not worked with discriminated unions much, so I may be using mine incorrectly.

The example POST I'm working with, bits of which I've included in my previous two questions, consists of one boundary that includes a second post with a new boundary. That second post includes several additional parts separated by the second boundary. Each of those several additional parts is a new post consisting of headers and XML.

My goal in this project is to build a library to be used in our C# solution, with the library taking a stream and returning the POST parsed into headers and parts recursively. I really want F# to shine here.

namespace MultipartMIMEParser

open FParsec
open System.IO

type Header = { name  : string
              ; value : string
              ; addl  : (string * string) list option }

type Content = Content of string
             | Post of Post list
and Post = { headers : Header list
           ; content : Content }

type UserState = { Boundary : string }
  with static member Default = { Boundary="" }

module internal P =
  let ($) f x = f x
  let undefined = failwith "Undefined."
  let ascii = System.Text.Encoding.ASCII
  let str cs = System.String.Concat (cs:char list)

  let makeHeader ((n,v),nvps) = { name=n; value=v; addl=nvps}

  let runP p s = match runParserOnStream p UserState.Default "" s ascii with
                 | Success (r,_,_) -> r
                 | Failure (e,_,_) -> failwith (sprintf "%A" e)

  let blankField = parray 2 newline

  let delimited d e =
      let pEnd = preturn () .>> e
      let part = spaces
                 >>. (manyTill
                      $ noneOf d
                      $ (attempt (preturn () .>> pstring d)
                                  <|> pEnd)) |>> str
       in part .>>. part

  let delimited3 firstDelimiter secondDelimiter thirdDelimiter endMarker =
      delimited firstDelimiter endMarker
      .>>. opt (many (delimited secondDelimiter endMarker
                      >>. delimited thirdDelimiter endMarker))

  let isBoundary ((n:string),_) = n.ToLower() = "boundary"

  let pHeader =
      let includesBoundary (h:Header) = match h.addl with
                                        | Some xs -> xs |> List.exists isBoundary
                                        | None    -> false
      let setBoundary b = { Boundary=b }
       in delimited3 ":" ";" "=" blankField
          |>> makeHeader
          >>= fun header stream -> if includesBoundary header
                                   then
                                     stream.UserState <- setBoundary (header.addl.Value
                                                                      |> List.find isBoundary
                                                                      |> snd)
                                     Reply ()
                                   else Reply ()

  let pHeaders = manyTill pHeader $ attempt (preturn () .>> blankField)

  let rec pContent (stream:CharStream<UserState>) =
      match stream.UserState.Boundary with
      | "" -> // Content is text.
              let nl = System.Environment.NewLine
              let unlines (ss:string list) = System.String.Join (nl,ss)
              let line = restOfLine false
              let lines = manyTill line $ attempt (preturn () .>> blankField)
               in pipe2 pHeaders lines
                        $ fun h c -> { headers=h
                                     ; content=Content $ unlines c }
      | _  -> // Content contains boundaries.
              let b = "--" + stream.UserState.Boundary
              // VS complains about pContent in the following line: 
              // Type mismatch. Expecting a
              //    Parser<'a,UserState>
              // but given a
              //    CharStream<UserState> -> Parser<Post,UserState>
              // The type 'Reply<'a>' does not match the type 'Parser<Post,UserState>'
              let p = pipe2 pHeaders pContent $ fun h c -> { headers=h; content=c }
               in skipString b
                  >>. manyTill p (attempt (preturn () .>> blankField))
                  // VS complains about Content.Post in the following line: 
                  // Type mismatch. Expecting a
                  //     Post list -> Post
                  // but given a
                  //     Post list -> Content
                  // The type 'Post' does not match the type 'Content'
                  |>> Content.Post

  // VS complains about pContent in the following line: 
  // Type mismatch. Expecting a
  //    Parser<'a,UserState>    
  // but given a
  //    CharStream<UserState> -> Parser<Post,UserState>
  // The type 'Reply<'a>' does not match the type 'Parser<Post,UserState>'
  let pStream = runP (pipe2 pHeaders pContent $ fun h c -> { headers=h; content=c })


type MParser (s:Stream) =
  let r = P.pStream s

  let findHeader name =
    match r.headers |> List.tryFind (fun h -> h.name.ToLower() = name) with
    | Some h -> h.value
    | None   -> ""

  member p.Boundary =
    let header = r.headers
                 |> List.tryFind (fun h -> match h.addl with
                                           | Some xs -> xs |> List.exists P.isBoundary
                                           | None    -> false)
     in match header with
        | Some h -> h.addl.Value |> List.find P.isBoundary |> snd
        | None   -> ""
  member p.ContentID = findHeader "content-id"
  member p.ContentLocation = findHeader "content-location"
  member p.ContentSubtype = findHeader "type"
  member p.ContentTransferEncoding = findHeader "content-transfer-encoding"
  member p.ContentType = findHeader "content-type"
  member p.Content = r.content
  member p.Headers = r.headers
  member p.MessageID = findHeader "message-id"
  member p.MimeVersion = findHeader "mime-version"

Edit

In response to the feedback I've received thus far (thank you!), I made the following adjustments, receiving the errors annotated:

let rec pContent (stream:CharStream<UserState>) =
    match stream.UserState.Boundary with
    | "" -> // Content is text.
            let nl = System.Environment.NewLine
            let unlines (ss:string list) = System.String.Join (nl,ss)
            let line = restOfLine false
            let lines = manyTill line $ attempt (preturn () .>> blankField)
             in pipe2 pHeaders lines
                      $ fun h c -> { headers=h
                                   ; content=Content $ unlines c }
    | _  -> // Content contains boundaries.
            let b = "--" + stream.UserState.Boundary
            // The following complaint is about `pContent stream`:
            // This expression was expected to have type
            //     Reply<'a>    
            // but here has type
            //     Parser<Post,UserState>
            let p = pipe2 pHeaders (fun stream -> pContent stream) $ fun h c -> { headers=h; content=c }
             in skipString b
                >>. manyTill p (attempt (preturn () .>> blankField))
                // VS complains about the line above:
                // Type mismatch. Expecting a
                //     Parser<Post,UserState>    
                // but given a
                //     Parser<'a list,UserState>    
                // The type 'Post' does not match the type ''a list'

// See above complaint about `pContent stream`. Same complaint here.
let pStream = runP (pipe2 pHeaders (fun stream -> pContent stream) $ fun h c -> { headers=h; content=c })

I tried throwing in Reply ()s, but they just returned parsers, meaning c above became a Parser<...> rather than Content. That seemed to have been a step backwards, or at least in the wrong direction. I admit my ignorance, though, and welcome correction!
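The `Reply<'a>` versus `Parser<Post,UserState>` complaint can be reproduced without FParsec. In FParsec, `Parser<'Result,'UserState>` is an abbreviation for the function type `CharStream<'UserState> -> Reply<'Result>`, so a function that takes the stream and returns a parser has one argument too many to be a parser itself. The sketch below models that with toy types; `ToyStream`, `ToyReply`, `ToyParser`, `choose`, and `content` are all hypothetical names, not FParsec API, and whether the same restructuring suits the real code is untested here.

```fsharp
// Toy stand-ins, NOT FParsec's real types: a stream that carries user state,
// and a parser modelled as a function from that stream to a reply.
type ToyStream = { mutable Boundary : string }
type ToyReply<'a> = ToyReply of 'a
type ToyParser<'a> = ToyStream -> ToyReply<'a>

// Picks a parser by inspecting the stream, as pContent does.
// Its type is ToyStream -> ToyParser<string>, so it is not itself a parser.
let choose (stream : ToyStream) : ToyParser<string> =
    match stream.Boundary with
    | "" -> fun _ -> ToyReply "text"
    | b  -> fun _ -> ToyReply ("multipart:" + b)

// Applying the chosen parser to the same stream produces a reply,
// so this wrapper does have the parser type.
let content : ToyParser<string> =
    fun stream -> (choose stream) stream
```

In this model, `fun stream -> choose stream` has the same shape as the lambda in the edit above: it returns a parser where a reply is expected. Passing the stream a second time is what collapses it back to a parser.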

Recommended answer

My first answer was completely wrong, but I thought I'd leave it up.

The types Post and Content are defined as:

type Content =
    | Content of string
    | Post of Post list
and Post =
    { headers : Header list
    ; content : Content }

Post is a Record, and Content is a Discriminated Union.

F# treats the cases for Discriminated Unions as a separate namespace from types. So Content is different from Content.Content, and Post is different from Content.Post. Because they are different, having the same identifier is confusing.
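A minimal script-style sketch of that distinction, using the type definitions from the question. The `Header` record is trimmed to two fields, and the value names `record`, `wrapped`, and `qualified` are hypothetical, introduced only for illustration.

```fsharp
type Header = { name : string; value : string }

type Content =
    | Content of string
    | Post of Post list
and Post =
    { headers : Header list
      content : Content }

// In a type annotation, `Post` names the record type.
// In the expression, `Content "hello"` is the union case Content.Content.
let record : Post = { headers = []; content = Content "hello" }

// In expression position, `Post` is the union case Content.Post.
let wrapped : Content = Post [ record ]

// The qualified form builds the same value and is easier to read.
let qualified : Content = Content.Post [ record ]
```

Here `wrapped` and `qualified` are equal values of type `Content`, while `record` has the unrelated record type `Post`.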

What is pContent supposed to return? If it is supposed to return the Discriminated Union Content, you need to wrap the Post record returned in the first case in the Content.Post case, i.e.:

$ fun h c -> Post [ { headers=h
                    ; content=Content $ unlines c } ]

(F# is able to infer that 'Post' refers to the Content.Post case here, rather than to the Post record type.)
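The "Expecting a `Post list -> Post` but given a `Post list -> Content`" message in the question follows from the rule that every branch of a `match` must have the same type: the first branch builds a `Post` record, while the second ends in `Content.Post`. The sketch below shows both branches agreeing on `Content` once the record is wrapped. It uses plain functions instead of parsers, and `toContent` and its parameters are hypothetical, not part of the original code.

```fsharp
type Header = { name : string; value : string }

type Content =
    | Content of string
    | Post of Post list
and Post =
    { headers : Header list
      content : Content }

// Hypothetical helper: both branches return the union type Content.
let toContent (boundary : string) (text : string) (parts : Post list) : Content =
    match boundary with
    | "" ->
        // Wrap the single record in the Post case so the branch yields Content.
        Post [ { headers = []; content = Content text } ]
    | _ ->
        // Already a list of records; the same case constructor applies.
        Post parts
```

Without the wrapping, the first branch would have type `Post` and the compiler would reject the second branch for the same reason it rejects `|>> Content.Post` in the question.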
