Easiest way of checking if a string consists of unique letters?


Question

I need to check in Java whether a word consists of unique letters (case-insensitive). Since the straightforward solution is boring, I came up with:

  1. For every char in a string check if indexOf(char) == lastIndexOf(char).
  2. Add all chars to HashSet and check if set size == string length.
  3. Convert a string to a char array, sort it alphabetically, loop through array elements and check if c[i] == c[i+1].
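
As a rough sketch, the three approaches listed above might look like this in Java. The class and method names are made up for illustration, and lowercasing with `Locale.ROOT` is an assumption about how to get case insensitivity; it works for plain letters but not for every Unicode edge case.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; names are illustrative only.
class UniqueLetterChecks {

    // Approach 1: a character is unique if its first and last positions coincide.
    static boolean byIndexOf(String word) {
        String s = word.toLowerCase(Locale.ROOT);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (s.indexOf(c) != s.lastIndexOf(c)) {
                return false;
            }
        }
        return true;
    }

    // Approach 2: put every character into a set and compare sizes.
    static boolean bySetSize(String word) {
        String s = word.toLowerCase(Locale.ROOT);
        Set<Character> seen = new HashSet<>();
        for (char c : s.toCharArray()) {
            seen.add(c);
        }
        return seen.size() == s.length();
    }

    // Approach 3: sort the characters so that duplicates become neighbours.
    static boolean bySorting(String word) {
        char[] c = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(c);
        for (int i = 0; i + 1 < c.length; i++) {
            if (c[i] == c[i + 1]) {
                return false;
            }
        }
        return true;
    }
}
```

All three treat the string as a sequence of `char` values, so characters outside the Basic Multilingual Plane would need code-point handling instead.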

Currently I like #2 the most; it seems like the easiest way. Any other interesting solutions?

Recommended answer

I don't like #1: it's an O(N^2) algorithm. Your #2 is roughly linear, but always traverses the entire string. Your #3 is O(N log N), with (probably) a relatively high constant, so it is probably almost always slower than #2.

My preference, however, would be to check, each time you insert a letter into the set, whether it was already present; if it was, you can stop immediately. Given a random distribution of letters, this should require scanning only half the string on average.
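
A minimal sketch of that early-exit idea, assuming a `HashSet<Character>` and per-character lowercasing for case insensitivity (the class and method names are hypothetical). It relies on `Set.add` returning `false` when the element is already in the set:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; names are illustrative only.
class EarlyExitCheck {

    static boolean hasUniqueLetters(String word) {
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < word.length(); i++) {
            char c = Character.toLowerCase(word.charAt(i));
            // add() returns false if c was already present, so stop at the first repeat.
            if (!seen.add(c)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, `EarlyExitCheck.hasUniqueLetters("Hello")` stops at the second `l` without looking at the final `o`.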

Both comments are correct that exactly what portion of the string you expect to scan will depend on the distribution and the length. At some point the string is long enough that a repeat is inevitable, and (for example) one character short of that, the chance is still pretty darned high. In fact, given a flat random distribution (i.e., all characters in the set are equally likely), this should fit closely with the birthday paradox, meaning the length at which a collision becomes likely grows with the square root of the number of possible characters in the character set. Just for example, if we assumed basic US-ASCII (128 characters) with equal probability, we'd reach a 50% chance of a collision at around 14 characters. Of course, in real strings we could probably expect it sooner than that, since the ASCII characters aren't used with anywhere close to equal frequency in most strings.
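
The 14-character figure can be checked with a short calculation. The sketch below assumes the idealised model described above (each character drawn independently and uniformly from the alphabet); the class and method names are made up for illustration.

```java
// Hypothetical helper class; names are illustrative only.
class BirthdayEstimate {

    // Probability that a string of the given length, drawn uniformly at random
    // from an alphabet of alphabetSize characters, contains at least one repeat.
    static double collisionProbability(int alphabetSize, int length) {
        double noRepeat = 1.0;
        for (int i = 0; i < length && noRepeat > 0.0; i++) {
            // The (i + 1)-th character must differ from the i characters before it.
            noRepeat *= (alphabetSize - i) / (double) alphabetSize;
        }
        return 1.0 - noRepeat;
    }

    // Shortest length at which a repeat is at least as likely as not.
    static int lengthForEvenOdds(int alphabetSize) {
        int length = 0;
        while (collisionProbability(alphabetSize, length) < 0.5) {
            length++;
        }
        return length;
    }
}
```

Under this model, `BirthdayEstimate.lengthForEvenOdds(128)` gives 14, and any string longer than the alphabet has a collision probability of 1.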
