List processing with intermediate state
Problem description
I am processing a list of strings, you can think of them as lines of a book. When a line is empty it must be discarded. When it is a title, it is "saved" as the current title. Every "normal" line must generate an object with its text and the current title. In the end you have a list of lines, each with its corresponding title.
For example:
- Chapter 1

Lorem ipsum dolor sit amet
consectetur adipisicing elit
- Chapter 2

sed do eiusmod tempor
incididunt u
First line is a title, second line must be discarded, then two lines are kept as paragraphs, each with "Chapter 1" as title. And so on. You end up with a collection similar to:
{"Lorem ipsum...", "Chapter 1"},
{"consectetur...", "Chapter 1"},
{"sed do...", "Chapter 2"},
{"incididunt ...", "Chapter 2"}
I know the title/paragraph model doesn't make 100% sense, but I simplified the model to illustrate the problem.
This is my iterative solution:
let parseText allLines =
    let mutable currentTitle = String.Empty
    seq {
        for line in allLines do
            match parseLine line with
            | Empty -> 0 |> ignore
            | Title caption ->
                currentTitle <- caption
            | Body text ->
                yield new Paragraph(currentTitle, text)
    }
The first issue is that I have to discard empty lines. I do it with 0 |> ignore, but it looks quite bad to me. What is the proper way to do this (without pre-filtering the list)?
A tail-recursive version of this function is straightforward:
let rec parseText allLines currentTitle paragraphs =
    match allLines with
    | [] -> paragraphs
    | head :: tail ->
        match head with
        | Empty -> parseText tail currentTitle paragraphs
        | Title caption -> parseText tail caption paragraphs
        // Prepend to the accumulator (paragraphs), not to the remaining input (tail)
        | Body text -> parseText tail currentTitle (new Paragraph(currentTitle, text) :: paragraphs)
Questions:
- Is there a significant difference between the two versions (style/performance/etc)?
- Is there a better approach to solve this problem? Is it possible to do it with a single List.map?
Recommended answer
Although it is not a single List.map, here is the solution I came up with:
let parseText allLines =
    allLines
    |> Seq.fold (fun (currentTitle,paragraphs) line ->
        match parseLine line with
        | Empty -> currentTitle,paragraphs
        | Title caption -> caption,paragraphs
        // Keep the current title so later body lines still receive it
        | Body text -> currentTitle,Paragraph(currentTitle, text)::paragraphs
    ) (String.Empty,[])
    |> snd
I'm using a fold with (currentTitle,paragraphs) as state. snd is used to extract the result (it is the second part of the state tuple).
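For readers who want to try the fold, here is a minimal self-contained sketch. The Line type, the Paragraph class, and the parseLine rule (a leading "- " marks a title) are assumptions made for illustration, since the question does not show them. The Body branch carries currentTitle forward so that consecutive lines share a title. Because a fold over a list accumulator prepends each paragraph, the sketch also adds List.rev to restore input order, which the answer's code does not do.

```fsharp
open System

// Hypothetical stand-ins: the question does not show these definitions.
type Line =
    | Empty
    | Title of string
    | Body of string

type Paragraph(title: string, text: string) =
    member this.Title = title
    member this.Text = text

// Assumed rule: blank lines are Empty, lines starting with "- " are titles.
let parseLine (line: string) =
    if String.IsNullOrWhiteSpace(line) then Empty
    elif line.StartsWith("- ") then Title(line.Substring(2))
    else Body line

// One fold step: the state is (current title, paragraphs collected so far).
let step (currentTitle, paragraphs) line =
    match parseLine line with
    | Empty -> currentTitle, paragraphs
    | Title caption -> caption, paragraphs
    | Body text -> currentTitle, Paragraph(currentTitle, text) :: paragraphs

let parseText (allLines: seq<string>) =
    allLines
    |> Seq.fold step (String.Empty, [])
    |> snd
    |> List.rev // the fold prepends, so reverse to get input order
```

Calling parseText [ "- Chapter 1"; ""; "Lorem ipsum"; "- Chapter 2"; "sed do" ] should give two paragraphs, the first titled "Chapter 1" and the second "Chapter 2".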
When you do most of your processing in F#, using lists is very tempting, but other data structures, even plain sequences, have their uses.
BTW, does your sequence code compile? I had to replace mutable currentTitle = String.Empty with currentTitle = ref String.Empty.
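A sketch of the sequence version with that change applied is shown below. A sequence expression is compiled into a closure, and F# does not allow a local let mutable to be captured by a closure, which is why a ref cell is needed. The same sketch uses () for the empty-line branch, which is the usual way to say "do nothing" instead of 0 |> ignore. The Line type, the Paragraph class, and the parseLine rule are assumptions for illustration, and placing the ref cell inside the seq block (so each enumeration starts with a fresh title) is a design choice of this sketch rather than something the answer states.

```fsharp
open System

// Hypothetical stand-ins: the question does not show these definitions.
type Line =
    | Empty
    | Title of string
    | Body of string

type Paragraph(title: string, text: string) =
    member this.Title = title
    member this.Text = text

// Assumed rule: blank lines are Empty, lines starting with "- " are titles.
let parseLine (line: string) =
    if String.IsNullOrWhiteSpace(line) then Empty
    elif line.StartsWith("- ") then Title(line.Substring(2))
    else Body line

let parseTextSeq (allLines: seq<string>) =
    seq {
        // A ref cell can be captured by the closure; a local 'let mutable' cannot.
        let currentTitle = ref String.Empty
        for line in allLines do
            match parseLine line with
            | Empty -> ()                                  // nothing to yield
            | Title caption -> currentTitle.Value <- caption
            | Body text -> yield Paragraph(currentTitle.Value, text)
    }
```

The result is lazy, so nothing is parsed until the sequence is enumerated, for example with List.ofSeq.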