How to truncate a string in Elixir?


Question

I'm working with slugs in Elixir. I have a string of [a-zA-Z0-9] words separated by hyphens, like this:

string = "another-long-string-to-be-truncated-and-much-text-here"

I want to ensure that the maximum string length is 30, but I also want to make sure that words aren't cut in half when the maximum length is reached. The first 30 characters of string are another-long-string-to-be-trun, but I want another-long-string-to-be, with the truncated word removed completely. How can I do that?

Recommended answer

First of all, if you don't care about performance at all, you can delegate all the work to a regex:

~r/\A(.{0,30})(?:-|\Z)/

I assume it is the shortest solution, but it is not an efficient one:

iex(28)> string
"another-long-string-to-be-truncated-and-much-text-here"
iex(29)> string2
"another-long-string-to-be-cool-about-that"

iex(30)> Regex.run(~r/\A(.{0,30})(?:-|\Z)/, string) |> List.last() 
"another-long-string-to-be"

iex(31)> Regex.run(~r/\A(.{0,30})(?:-|\Z)/, string2) |> List.last()
"another-long-string-to-be-cool"

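One caveat with the one-liner above: Regex.run/3 returns nil when nothing matches (for example, a string longer than 30 characters with no hyphen in its first 31), and List.last/1 would then raise. The following is a minimal sketch of a wrapper that handles that case. The module name RegexSlug and the choice to return "" on no match are assumptions for illustration, not part of the original answer.

```elixir
defmodule RegexSlug do
  # Hypothetical wrapper (not from the answer) around the answer's regex.
  def truncate(string) do
    # capture: :all_but_first returns only the capture group,
    # so List.last/1 is no longer needed.
    case Regex.run(~r/\A(.{0,30})(?:-|\Z)/, string, capture: :all_but_first) do
      [truncated] -> truncated
      # Assumption: fall back to an empty string when no boundary is found.
      nil -> ""
    end
  end
end
```

Calling RegexSlug.truncate(string) with the string from the session above should give the same result as the Regex.run pipeline.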


Efficient solution

But if you do care about performance and memory, then I suggest this:

defmodule CoolSlugHelper do
  def slug(input, length \\ 30) do
    length_minus_1 = length - 1

    case input do
      # if the substring of the requested length ends with "-"
      # e.g. "abc-def-ghi", 8 or "abc-def-", 8 -> "abc-def"
      <<result::binary-size(length_minus_1), "-", _::binary>> -> result

      # if the next char after the substring is "-"
      # e.g. "abc-def-ghi", 7 or "abc-def-", 7 -> "abc-def"
      <<result::binary-size(length), "-", _::binary>> -> result

      # if the input is exactly `length` bytes long, e.g. "abc-def", 7 -> "abc-def"
      <<_::binary-size(length)>> -> input

      # return an empty string if we reached the beginning of the string
      _ when length <= 1 -> ""

      # otherwise look into shorter substring
      _ -> slug(input, length_minus_1)
    end
  end
end

It does not build the resulting string char by char. Instead, it looks for the correct substring, starting from the desired length and going down to 1. That's how it becomes efficient in terms of memory and speed.
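
To make the backward search easier to follow, here is the same idea written with explicit byte indexes instead of binary patterns. This is a sketch, not the answer's code: it uses :binary.at/2 and binary_part/3, the module name SlugScan is hypothetical, and it has not been benchmarked. It also returns an input that already fits within the limit unchanged, so its behavior differs from CoolSlugHelper for inputs that end in a hyphen.

```elixir
defmodule SlugScan do
  # Hypothetical module illustrating the "scan down from the limit" idea.
  def truncate(input, max \\ 30)

  # Nothing to cut when the whole input already fits.
  def truncate(input, max) when byte_size(input) <= max, do: input

  # Otherwise look for a hyphen, starting at byte index `max`.
  def truncate(input, max), do: scan(input, max)

  # Reached the start without finding a word boundary.
  defp scan(_input, 0), do: ""

  defp scan(input, pos) do
    if :binary.at(input, pos) == ?- do
      # The byte right after the prefix is "-", so the prefix ends on a word.
      binary_part(input, 0, pos)
    else
      scan(input, pos - 1)
    end
  end
end
```

As in the answer's version, no intermediate strings are built while searching. Only the final prefix is taken from the input once the boundary is found.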

We need the length_minus_1 variable because we cannot use an expression such as length - 1 inside binary-size in a binary pattern match.
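
As a small illustration of that pattern, the sketch below binds the size to a variable first and then uses it in binary-size. The module and function names are hypothetical and only serve to isolate this one matching step.

```elixir
defmodule BinarySizeDemo do
  # Hypothetical helper: returns the first `n` bytes of `input`
  # if they are immediately followed by a hyphen.
  def prefix_before_hyphen(input, n) do
    case input do
      # `n` is a plain variable bound before the match.
      <<prefix::binary-size(n), "-", _rest::binary>> -> {:ok, prefix}
      _ -> :error
    end
  end
end
```

For example, BinarySizeDemo.prefix_before_hyphen("abc-def", 3) matches because the fourth byte is a hyphen, while a size of 2 does not.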

Here is the benchmark of all the proposed solutions as of Dec 22nd, 2018:

(Simple Regex is the ~r/\A(.{0,30})(?:-|\Z)/ regex above.)

Name                     ips        average  deviation         median         99th %
CoolSlugHelper      352.14 K        2.84 μs  ±1184.93%           2 μs           8 μs
SlugHelper           70.98 K       14.09 μs   ±170.20%          10 μs          87 μs
Simple Regex         33.14 K       30.17 μs   ±942.90%          21 μs         126 μs
Truncation           11.56 K       86.51 μs    ±84.81%          62 μs         299 μs

Comparison: 
CoolSlugHelper      352.14 K
SlugHelper           70.98 K - 4.96x slower
Simple Regex         33.14 K - 10.63x slower
Truncation           11.56 K - 30.46x slower

Memory usage statistics:

Name              Memory usage
CoolSlugHelper         2.30 KB
SlugHelper            12.94 KB - 5.61x memory usage
Simple Regex          20.16 KB - 8.75x memory usage
Truncation            35.36 KB - 15.34x memory usage
