Indices of a substring in Smalltalk

Question

It seems that Smalltalk implementations lack an algorithm that returns all the indices of a substring in a String. The most similar ones return only a single index, for example the firstIndexesOf:in:, findSubstring: and findAnySubstring: variants.

There are implementations in Ruby, but the first one relies on a Ruby hack, the second one does not work because it ignores overlapping Strings, and the last one uses an Enumerator class which I don't know how to translate to Smalltalk. I wonder whether this Python implementation is the best place to start, since it considers both cases (overlapping or not) and does not use regular expressions.

My goal is to find a package or method which provides the following behavior:

'ABDCDEFBDAC' indicesOf: 'BD'. "#(2 8)"

When considering overlapping:

'nnnn' indicesOf: 'nn' overlapping: true. "#(0 2)"

When not considering overlapping:

'nnnn' indicesOf: 'nn' overlapping: false. "#(0 1 2)"

In Pharo, when a text is selected in a Playground, a scanner detects the substring and highlights matches. However, I couldn't find a String implementation of this.

My best effort so far is this implementation in String (Pharo 6):

indicesOfSubstring: subString
  | indices i |

  indices := OrderedCollection new: self size.
  i := 0.
  [ (i := self findString: subString startingAt: i + 1) > 0 ] whileTrue: [
    indices addLast: i ].
  ^ indices

Recommended answer

Let me first clarify that Smalltalk collections are 1-based, not 0-based. Therefore your examples should read:

'nnnn' indexesOf: 'nn' overlapping: false. "#(1 3)"
'nnnn' indexesOf: 'nn' overlapping: true. "#(1 2 3)"

Note that I've also taken @lurker's observation into account (and have tweaked the selector too).

Now, starting from your code, I would change it as follows:

indexesOfSubstring: subString overlapping: aBoolean
  | n indexes i |
  n := subString size.
  indexes := OrderedCollection new.                            "removed the size"
  i := 1.                                                      "1-based"
  [
    i := self findString: subString startingAt: i.             "split condition"
    i > 0]
  whileTrue: [
    indexes add: i.                                            "add: = addLast:"
    i := aBoolean ifTrue: [i + 1] ifFalse: [i + n]].           "new!"
  ^indexes

Make sure you write a few unit tests (and don't forget to exercise the border cases!)
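As a rough sketch of what such tests could look like, here is a small SUnit test case for Pharo. It assumes indexesOfSubstring:overlapping: has been added to String as shown above. The class name IndexesOfSubstringTest and the package name 'StringIndexes-Tests' are made up for illustration, and each test below is a separate method of that class.

```smalltalk
"Hypothetical test class; the class and package names are placeholders."
TestCase subclass: #IndexesOfSubstringTest
  instanceVariableNames: ''
  classVariableNames: ''
  package: 'StringIndexes-Tests'

testOverlapping
  "Every start position counts, even when matches share characters."
  self
    assert: ('nnnn' indexesOfSubstring: 'nn' overlapping: true) asArray
    equals: #(1 2 3)

testNonOverlapping
  "After a match, the search resumes past the matched substring."
  self
    assert: ('nnnn' indexesOfSubstring: 'nn' overlapping: false) asArray
    equals: #(1 3)

testSeparatedMatches
  "The example from the question, with 1-based indexes."
  self
    assert: ('ABDCDEFBDAC' indexesOfSubstring: 'BD' overlapping: false) asArray
    equals: #(2 8)

testNoMatch
  "Border case: the substring does not occur at all."
  self assert: ('abc' indexesOfSubstring: 'xyz' overlapping: true) isEmpty

testSubstringLongerThanReceiver
  "Border case: the substring cannot fit in the receiver."
  self assert: ('ab' indexesOfSubstring: 'abc' overlapping: false) isEmpty
```

Other border cases worth checking by hand in your image are an empty substring and a match that ends exactly at the last character of the receiver, since both depend on how findString:startingAt: treats those inputs.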
