Slice/Group a sequence of equal chars in F#

Question

I need to extract the runs of consecutive equal characters from a text.

For example, the string "aaaBbbcccccccDaBBBzcc11211" should be converted to a list of strings like ["aaa";"B";"bb";"ccccccc";"D";"a";"BBB";"z";"cc";"11";"2";"11"].

This is my solution so far:

let groupSequences (text:string) = 

    let toString chars =
        System.String(chars |> Array.ofList)

    let rec groupSequencesRecursive acc chars = seq {
        match (acc, chars) with
        | [], c :: rest -> 
            yield! groupSequencesRecursive [c] rest
        | _, c :: rest when acc.[0] <> c -> 
            yield (toString acc)
            yield! groupSequencesRecursive [c] rest
        | _, c :: rest when acc.[0] = c -> 
            yield! groupSequencesRecursive (c :: acc) rest
        | _, [] -> 
            yield (toString acc)
        | _ -> 
            yield ""
    }

    text
    |> List.ofSeq
    |> groupSequencesRecursive []

groupSequences "aaaBbbcccccccDaBBBzcc11211"
|> Seq.iter (fun x -> printfn "%s" x)
|> ignore

I'm new to F#.

Can this solution be improved?

Recommended answer

Here is a completely generic implementation:

let group xs =
    let folder x = function
        | [] -> [[x]]
        | (h::t)::ta when h = x -> (x::h::t)::ta
        | acc -> [x]::acc
    Seq.foldBack folder xs []

This function has the type seq<'a> -> 'a list list when 'a : equality, so it works not only on strings but on any finite sequence of elements, as long as the element type supports equality comparison. The sequence must be finite because Seq.foldBack has to reach the last element before it can start folding.
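To see why folding from the right gives the groups in the original order, it can help to apply the folder by hand. The sketch below lifts the same folder to the top level (the names step1, step2 and step3 are only for illustration) and traces the input "aab", for which Seq.foldBack folder "aab" [] expands to folder 'a' (folder 'a' (folder 'b' [])):

```fsharp
// The same folder as in `group`, lifted to the top level
// so that each step can be called directly.
let folder x = function
    | [] -> [[x]]                             // first element seen: start the first group
    | (h::t)::ta when h = x -> (x::h::t)::ta  // same as the head of the current group: extend it
    | acc -> [x]::acc                         // different element: start a new group in front

// Seq.foldBack visits the elements from last to first.
let step1 = folder 'b' []        // [['b']]
let step2 = folder 'a' step1     // [['a']; ['b']]
let step3 = folder 'a' step2     // [['a'; 'a']; ['b']]
```

Because each element is consed onto the front of the accumulator, processing from the right leaves the groups in left-to-right order without a final reversal.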

When used with the input string from the question, the return value isn't quite in the expected shape:

> group "aaaBbbcccccccDaBBBzcc11211";;
val it : char list list =
  [['a'; 'a'; 'a']; ['B']; ['b'; 'b']; ['c'; 'c'; 'c'; 'c'; 'c'; 'c'; 'c'];
   ['D']; ['a']; ['B'; 'B'; 'B']; ['z']; ['c'; 'c']; ['1'; '1']; ['2'];
   ['1'; '1']]

Instead of a string list, the return value is a char list list. You can easily convert it to a list of strings with List.map:

> group "aaaBbbcccccccDaBBBzcc11211" |> List.map (List.toArray >> System.String);;
val it : System.String list =
  ["aaa"; "B"; "bb"; "ccccccc"; "D"; "a"; "BBB"; "z"; "cc"; "11"; "2"; "11"]

This takes advantage of the String constructor overload that takes a char[] as input.
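If the string-specific shape from the question is what you want, the two steps can be wrapped in one function. This is only a sketch: the name groupStrings is made up here, and it uses an explicit lambda instead of passing the constructor as a function, on the assumption that this also compiles on older F# versions.

```fsharp
// Generic grouping, as defined above.
let group xs =
    let folder x = function
        | [] -> [[x]]
        | (h::t)::ta when h = x -> (x::h::t)::ta
        | acc -> [x]::acc
    Seq.foldBack folder xs []

// Hypothetical wrapper: group the chars, then turn each char list into a string
// via the String constructor overload that takes a char[].
let groupStrings (text: string) : string list =
    group text
    |> List.map (fun chars -> System.String(List.toArray chars))

// Prints each run on its own line, like the code in the question.
groupStrings "aaaBbbcccccccDaBBBzcc11211"
|> List.iter (printfn "%s")
```

Keeping group generic and converting at the edge means the same function still works for non-character input.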

As stated at the start, this implementation is generic, so it can also be used with other element types, for example integers:

> group [1;1;2;2;2;3;4;4;3;3;3;0];;
val it : int list list = [[1; 1]; [2; 2; 2]; [3]; [4; 4]; [3; 3; 3]; [0]]
