F#我应该如何考虑序列中的定界? [英] F# How should I think about delimiting items in sequences?

查看:61
本文介绍了F#我应该如何考虑序列中的定界?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一个菜鸟问题的道歉.我正在尝试将我的思维模式从程序性转变为功能性.

Apologies for a rookie question. I'm trying to change my mental paradigm from procedural to functional.

例如,假设我有一个要打印的姓名列表,例如"John,Paul,George和Ringo".但是此代码不满足:

For instance, suppose I have a list of names that I want to print like this "John, Paul, George, and Ringo." But this code does not satisfy:

let names = [ "John"; "Paul"; "George"; "Ringo" ]
names |> Seq.iter (fun s -> printf "%s, " s)

我的程序性本能是寻求一种将谓词暗示到该lambda中的方式,以便它可以在,"或以及"或."之间进行分支,具体取决于我们在迭代序列的位置.我认为这是错误的,但是我对正确的事情感到困惑.

My procedural instinct is to seek a way to insinuate a predicate into that lambda so that it can branch between ", " or ", and " or ". " depending upon where we're at iterating the sequence. I think that's wrong, but I'm feeling around for what's right.

将序列分成多个部分会更好吗?

Would it be better to split the sequence in parts?

在这种情况下,我们似乎想将序列拆分为与不同的定界符行为相对应的部分.我们想在最后分割它,所以我们不能使用Seq.但是我们可以改用List.splitAt.

In this case it seems that we want to split the sequence into parts corresponding to distinct delimiter behaviors. We want to split it at the end, so we can't use Seq. But we can use List.splitAt instead.

let start, ending = List.splitAt (names.Length - 1) names
let penultimate, last = List.splitAt 1 ending
start |> Seq.iter (fun s -> printf "%s, " s)
penultimate |> Seq.iter (fun s -> printf "%s, and " s)
last |> Seq.iter (fun s -> printf "%s. " s)

这是正义的做法吗?有没有我忽视的更好的解决方案?我在沿着正确的方向思考吗?

Is this a righteous approach? Is there a better solution I've overlooked? Am I thinking along the right lines?

推荐答案

我要解决这些问题的一般方法是将它们分成较小的部分并单独解决:

The general approach I take to tackle these kind of problems is to split them into smaller parts and solve individually:

  • 一个空列表[]导致""
  • 一个元素["a"]产生"a."
  • 两个元素[ "a"; "b" ]导致"a and b."
  • 更多元素(即a :: rest)产生"a, " + takeCareOf rest,其中takeCareOf遵循上述规则.请注意,我们不需要知道完整列表的长度.
  • an empty list [] results in ""
  • one element ["a"] results in "a."
  • two elements [ "a"; "b" ] result in "a and b."
  • more elements (that is a :: rest) result in "a, " + takeCareOf rest, where takeCareOf follows above rules. Note that we don't need to know the length of the full list.

以上配方可直接翻译为F#(以及一般的功能语言):

Above recipe directly translates to F# (and functional languages in general):

let rec commaAndDot' = function
| [] -> ()
| [ a ] -> printfn "%s." a
| a :: [ b ] -> printfn "%s and %s." a b
| a :: rest -> printf "%s, " a; commaAndDot' rest

我们完成了吗?不,commaAndDot'违反了单一责任原则,因为该函数实现了我们的业务逻辑" 打印到控制台.让我们解决这个问题:

Are we done yet? No, commaAndDot' violates the Single Responsibility Principle because the function implements our 'business logic' and prints to the console. Let's fix that:

let rec commaAndDot'' = function
| [] -> ""
| [ a ] -> sprintf "%s." a
| a :: [ b ] -> sprintf "%s and %s." a b
| a :: rest -> sprintf "%s, " a + commaAndDot'' rest

另一个好处是,我们现在可以并行调用该函数,并且输出不再混淆.

As an additional benefit we can now call the function in parallel and the output does not get mixed up anymore.

我们完成了吗?不,上面的函数不是尾递归的(在将commaAndDot'' rest连接到当前结果之前,我们需要计算commaAndDot'' rest),并且会浪费大量的堆栈供大型列表使用.解决此问题的标准方法是引入累加器acc:

Are we done yet? No, above function is not tail-recursive (we need to compute commaAndDot'' rest before concatenating it to the current result) and would blow the stack for large lists. A standard approach to fixing this is to introduce an accumulator acc:

let  commaAndDot''' words =
    let rec helper acc = function
    | [] -> acc
    | [ a ] -> sprintf "%s%s." acc a
    | a :: [ b ] -> sprintf "%s%s and %s." acc a b
    | a :: rest ->  helper (acc + sprintf "%s, " a) rest
    helper "" words

我们完成了吗?不,commaAndDot'''为中间结果创建很多字符串.由于F#不是纯语言,我们可以利用局部(私有,不可观察的)突变来优化内存和速度:

Are we done yet? No, commaAndDot''' creates a lot of strings for intermediate results. Thanks to F# not being a pure language, we can leverage local (private, non-observable) mutation to optimize for memory and speed:

let  commaAndDot words =
    let sb = System.Text.StringBuilder()
    let rec helper = function
    | [] -> sb
    | [ a ] -> sprintf "%s." a |> sb.Append
    | a :: [ b ] -> sprintf "%s and %s." a b |> sb.Append
    | a :: rest ->
        sprintf "%s, " a |> sb.Append |> ignore
        helper rest
    helper words |> string

我们完成了吗?可能...至少这是我认为惯用的F#并愉快地提交的东西.为了进一步优化(例如分别分隔逗号和点或更改图案的顺序),我会先写微基准,然后再牺牲可读性.

Are we done yet? Probably... at least this is something I would consider idiomatic F# and happily commit. For optimising further (e.g. Appending commas and dots separately or changing the order of the patterns) I'd first write micro-benchmarks before sacrificing readability.

所有版本都会产生相同的输出:

All versions generate the same output:

commaAndDot []                          // ""
commaAndDot [ "foo" ]                   // "foo."
commaAndDot [ "foo"; "bar" ]            // "foo and bar."
commaAndDot [ "Hello"; "World"; "F#" ]  // "Hello, World and F#."

更新:SCNR,创建了一个基准测试...结果显示为HTML代码段(用于漂亮的表格数据).

Update: SCNR, created a benchmark... results are below as a HTML snippet (for nice tabular data).

BuilderOpt是StringBuilder版本,其中[]大小写移至底部, BuilderChained具有链接的Append调用,例如sb.Append(a).Append(" and ").Append(b)和BuilderFormat例如sb.AppendFormat("{0} and {1}", a, b).完整的源代码可用.

BuilderOpt is the StringBuilder version with the [] case moved to the bottom, BuilderChained is with chained Append calls, e.g. sb.Append(a).Append(" and ").Append(b) and BuilderFormat is e.g. sb.AppendFormat("{0} and {1}", a, b). Full source code available.

按预期,对于较小的列表,简单"版本的性能更好,列表越大,BuilderChained越好. Concat的性能比我预期的要好,但不会产生正确的输出(缺少.",缺少一种情况).收率相当慢...

As expected, 'simpler' versions perform better for small lists, the larger the list the better BuilderChained. Concat performs better than I expected but does not produce the right output (missing ".", lacking one case). Yield gets rather slow...

<!DOCTYPE html>
<html lang='en'>
<head>
<meta charset='utf-8' />
<title>Benchmark.CommaAndDot</title>

<style type="text/css">
	table { border-collapse: collapse; display: block; width: 100%; overflow: auto; }
	td, th { padding: 6px 13px; border: 1px solid #ddd; }
	tr { background-color: #fff; border-top: 1px solid #ccc; }
	tr:nth-child(even) { background: #f8f8f8; }
</style>
</head>
<body>
<pre><code>
BenchmarkDotNet=v0.11.1, OS=Windows 10.0.16299.726 (1709/FallCreatorsUpdate/Redstone3)
Intel Core i7 CPU 950 3.07GHz (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2998521 Hz, Resolution=333.4977 ns, Timer=TSC
  [Host]     : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit LegacyJIT-v4.7.3190.0 DEBUG
  DefaultJob : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3190.0
</code></pre>
<pre><code></code></pre>

<table>
<thead><tr><th>  Method</th><th>Verbosity</th><th>    Mean</th><th>Error</th><th>StdDev</th><th>  Median</th><th>Scaled</th><th>ScaledSD</th>
</tr>
</thead><tbody><tr><td>Concat</td><td>0</td><td>39.905 ns</td><td>0.0592 ns</td><td>0.0494 ns</td><td>39.906 ns</td><td>1.02</td><td>0.11</td>
</tr><tr><td>Yield</td><td>0</td><td>27.235 ns</td><td>0.0772 ns</td><td>0.0603 ns</td><td>27.227 ns</td><td>0.69</td><td>0.07</td>
</tr><tr><td>Accumulator</td><td>0</td><td>1.956 ns</td><td>0.0109 ns</td><td>0.0096 ns</td><td>1.954 ns</td><td>0.05</td><td>0.01</td>
</tr><tr><td>Builder</td><td>0</td><td>32.384 ns</td><td>0.2986 ns</td><td>0.2331 ns</td><td>32.317 ns</td><td>0.82</td><td>0.09</td>
</tr><tr><td>BuilderOpt</td><td>0</td><td>33.664 ns</td><td>1.0371 ns</td><td>0.9194 ns</td><td>33.402 ns</td><td>0.86</td><td>0.09</td>
</tr><tr><td>BuilderChained</td><td>0</td><td>39.671 ns</td><td>1.2097 ns</td><td>3.5669 ns</td><td>41.339 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>0</td><td>40.276 ns</td><td>0.8909 ns</td><td>1.8792 ns</td><td>39.494 ns</td><td>1.02</td><td>0.12</td>
</tr><tr><td>Concat</td><td>1</td><td>153.116 ns</td><td>1.1592 ns</td><td>0.9050 ns</td><td>152.706 ns</td><td>0.87</td><td>0.01</td>
</tr><tr><td>Yield</td><td>1</td><td>154.522 ns</td><td>0.2890 ns</td><td>0.2256 ns</td><td>154.479 ns</td><td>0.88</td><td>0.00</td>
</tr><tr><td>Accumulator</td><td>1</td><td>223.342 ns</td><td>0.3678 ns</td><td>0.2872 ns</td><td>223.412 ns</td><td>1.27</td><td>0.00</td>
</tr><tr><td>Builder</td><td>1</td><td>232.194 ns</td><td>0.2951 ns</td><td>0.2465 ns</td><td>232.265 ns</td><td>1.32</td><td>0.00</td>
</tr><tr><td>BuilderOpt</td><td>1</td><td>232.016 ns</td><td>0.5654 ns</td><td>0.4722 ns</td><td>232.170 ns</td><td>1.31</td><td>0.00</td>
</tr><tr><td>BuilderChained</td><td>1</td><td>176.473 ns</td><td>0.3918 ns</td><td>0.3272 ns</td><td>176.341 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>1</td><td>219.262 ns</td><td>6.7995 ns</td><td>6.3603 ns</td><td>217.003 ns</td><td>1.24</td><td>0.03</td>
</tr><tr><td>Concat</td><td>10</td><td>1,284.042 ns</td><td>1.7035 ns</td><td>1.4225 ns</td><td>1,283.443 ns</td><td>1.68</td><td>0.05</td>
</tr><tr><td>Yield</td><td>10</td><td>6,532.667 ns</td><td>12.6169 ns</td><td>10.5357 ns</td><td>6,533.504 ns</td><td>8.55</td><td>0.24</td>
</tr><tr><td>Accumulator</td><td>10</td><td>2,701.483 ns</td><td>4.8509 ns</td><td>4.5376 ns</td><td>2,700.208 ns</td><td>3.54</td><td>0.10</td>
</tr><tr><td>Builder</td><td>10</td><td>1,865.668 ns</td><td>5.0275 ns</td><td>3.9252 ns</td><td>1,866.920 ns</td><td>2.44</td><td>0.07</td>
</tr><tr><td>BuilderOpt</td><td>10</td><td>1,820.402 ns</td><td>2.7853 ns</td><td>2.3258 ns</td><td>1,820.464 ns</td><td>2.38</td><td>0.07</td>
</tr><tr><td>BuilderChained</td><td>10</td><td>764.334 ns</td><td>19.8528 ns</td><td>23.6334 ns</td><td>756.988 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>10</td><td>1,177.186 ns</td><td>1.9584 ns</td><td>1.6354 ns</td><td>1,177.897 ns</td><td>1.54</td><td>0.04</td>
</tr><tr><td>Concat</td><td>100</td><td>25,579.773 ns</td><td>824.1504 ns</td><td>688.2028 ns</td><td>25,288.873 ns</td><td>5.33</td><td>0.14</td>
</tr><tr><td>Yield</td><td>100</td><td>421,872.560 ns</td><td>902.5023 ns</td><td>753.6302 ns</td><td>421,782.071 ns</td><td>87.87</td><td>0.23</td>
</tr><tr><td>Accumulator</td><td>100</td><td>80,579.168 ns</td><td>227.7392 ns</td><td>177.8038 ns</td><td>80,547.868 ns</td><td>16.78</td><td>0.05</td>
</tr><tr><td>Builder</td><td>100</td><td>15,047.790 ns</td><td>26.2248 ns</td><td>21.8989 ns</td><td>15,048.903 ns</td><td>3.13</td><td>0.01</td>
</tr><tr><td>BuilderOpt</td><td>100</td><td>15,287.117 ns</td><td>39.8679 ns</td><td>31.1262 ns</td><td>15,293.739 ns</td><td>3.18</td><td>0.01</td>
</tr><tr><td>BuilderChained</td><td>100</td><td>4,800.966 ns</td><td>11.3614 ns</td><td>10.0716 ns</td><td>4,801.450 ns</td><td>1.00</td><td>0.00</td>
</tr><tr><td>BuilderFormat</td><td>100</td><td>8,382.896 ns</td><td>87.8963 ns</td><td>68.6236 ns</td><td>8,368.400 ns</td><td>1.75</td><td>0.01</td>
</tr></tbody></table>
</body>
</html>

这篇关于F#我应该如何考虑序列中的定界?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆