Split a string into a list, but keep the split pattern

Question

Currently I am splitting a string by a pattern, like this:

outcome_array=the_text.split(pattern_to_split_by)

The problem is that the pattern I split by is always omitted from the result.

How do I get the result to include the split pattern itself?

Recommended answer

Thanks to Mark Wilkins for the inspiration, but here's a shorter bit of code for doing it:

irb(main):015:0> s = "split on the word on okay?"
=> "split on the word on okay?"
irb(main):016:0> b = []; s.split(/(on)/).each_slice(2) { |pair| b << pair.join }; b
=> ["split on", " the word on", " okay?"]

Or:

s.split(/(on)/).each_slice(2).map(&:join)

See below the fold for an explanation.

Here's how this works. First, we split on "on", but wrap it in parentheses to make it a match (capture) group. When the regular expression passed to split contains a match group, Ruby includes the text matched by that group in the output:

s.split(/(on)/)
# => ["split ", "on", " the word ", "on", " okay?"]
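To make the effect of the parentheses concrete, here is a small sketch comparing the same split with and without the group. The variable names `without_group` and `with_group` are only for illustration.

```ruby
s = "split on the word on okay?"

# Without a group, the matched text is dropped from the result.
without_group = s.split(/on/)
p without_group  # => ["split ", " the word ", " okay?"]

# With a group, each match appears as its own element between the pieces.
with_group = s.split(/(on)/)
p with_group     # => ["split ", "on", " the word ", "on", " okay?"]
```

Note that the surrounding spaces stay attached to the pieces, because only "on" itself is the delimiter.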

Now we want to join each instance of "on" with the string that precedes it. each_slice(2) helps by passing two elements at a time to its block. Let's just invoke each_slice(2) to see the result. Since each_slice returns an Enumerator when invoked without a block, we'll call to_a on it so we can see what the Enumerator will enumerate over:

s.split(/(on)/).each_slice(2).to_a
# => [["split ", "on"], [" the word ", "on"], [" okay?"]]
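The last slice has only one element, because the text after the final "on" has no delimiter following it. A short sketch of that behavior, starting from a literal array that mirrors the split output (the names `pieces` and `pairs` are illustrative):

```ruby
pieces = ["split ", "on", " the word ", "on", " okay?"]

# each_slice(2) groups elements in twos; a leftover element gets its own slice.
pairs = pieces.each_slice(2).to_a
p pairs            # => [["split ", "on"], [" the word ", "on"], [" okay?"]]

# Joining a one-element slice simply returns that element as a string.
p pairs.last.join  # => " okay?"
```

This is why the trailing text survives the join step without any special handling.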

We're getting close. Now all we have to do is join each pair into a single string. That gets us to the full solution above. I'll unwrap it into individual lines to make it easier to follow:

b = []
# Each pair is [text, delimiter]; the last slice may hold only the trailing text.
s.split(/(on)/).each_slice(2) do |pair|
  b << pair.join
end
b
# => ["split on", " the word on", " okay?"]

But there's a nifty way to eliminate the temporary b and shorten the code considerably:

s.split(/(on)/).each_slice(2).map do |a|
  a.join
end

map passes each element of its input to the block; the result of the block becomes the element at that position in the output array. In MRI 1.8.7 and later, Symbol#to_proc lets you shorten it even more, to the equivalent:

s.split(/(on)/).each_slice(2).map(&:join)
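If the delimiter is not known in advance, the same idea can be wrapped in a small method. This is only a sketch: `split_keeping_delimiter` is a hypothetical helper, not part of Ruby, and it assumes the pattern is a String or a Regexp with no capture groups of its own (extra groups would add extra elements and break the pairing).

```ruby
# Hypothetical helper, not part of Ruby's standard library.
# Assumes `pattern` is a String or a Regexp without its own capture groups.
def split_keeping_delimiter(text, pattern)
  if pattern.is_a?(Regexp)
    source = pattern.source
    options = pattern.options
  else
    source = Regexp.escape(pattern)  # treat a String literally
    options = 0
  end

  # Wrap the whole pattern in one group so split keeps the matched text.
  grouped = Regexp.new("(#{source})", options)
  text.split(grouped).each_slice(2).map(&:join)
end

p split_keeping_delimiter("split on the word on okay?", "on")
# => ["split on", " the word on", " okay?"]
p split_keeping_delimiter("a1b22c", /\d+/)
# => ["a1", "b22", "c"]
```

Each delimiter stays attached to the text that precedes it, matching the behavior of the one-liner above.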
