Comparing strings of equal lengths and noting where the differences occur


Question

Given two strings of equal length such that

s1 = "ACCT"
s2 = "ATCT"

I would like to find the positions where the strings differ, so I have done this. (Please suggest a better way of doing it. I bet there is one.)

z = s1.chars.zip(s2.chars).each_with_index.map { |(a, b), index| index + 1 if a != b }.compact

z is an array of the (1-based) positions where the two strings differ. In this case z is [2].

Imagine that I add a new string

s3 = "AGCT"

and I wish to compare it with the others and see where the three strings differ. We could take the same approach as above, but this time

s1.chars.zip(s2.chars,s3.chars)

returns an array of arrays. With two strings I was relying on just comparing two chars for equality, but as I add more strings, and as the strings become longer, it starts to become overwhelming.

#=> [["A", "A", "A"], ["C", "T", "G"], ["C", "C", "C"], ["T", "T", "T"]]

Running

s1.chars.zip(s2.chars, s3.chars).map { |item| item.uniq }

 #=> [["A"], ["C", "T", "G"], ["C"], ["T"]] 

can help reduce redundancy and identify the positions where all strings agree (subarrays of size 1). I could then print out the indices and contents of the subarrays of size > 1.

s1.chars.zip(s2.chars, s3.chars).map(&:uniq).each_with_index.map { |a, index| [index + 1, a] unless a.size == 1 }.compact.map { |h| Hash[*h] }
#=> [{2=>["C", "T", "G"]}]

I feel that this will grind to a halt, or at least get slow, as I increase the number of strings and as the strings get longer. What are some better-performing alternatives? Thank you.

Recommended answer

Here's where I'd start. I'm purposely using different strings to make it easier to see the differences:

str1 = 'jackdaws love my giant sphinx of quartz'
str2 = 'jackdaws l0ve my gi4nt sphinx 0f qu4rtz'

To get the first string's differing characters, each paired with its zero-based index:

str1.each_char.with_index.to_a - str2.each_char.with_index.to_a
=> [["o", 10], ["a", 19], ["o", 30], ["a", 35]]

To get the second string's differing characters:

str2.each_char.with_index.to_a - str1.each_char.with_index.to_a
=> [["0", 10], ["4", 19], ["0", 30], ["4", 35]]

There will be a little slowdown as the strings get bigger, but it won't be bad.
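
If both sides of each difference are wanted at once, the two subtractions above can be folded into a single pass. The following is a minimal sketch rather than part of the original answer; the method name `diff_positions` is hypothetical, and it assumes both strings have the same length:

```ruby
# Hypothetical helper (not from the original answer): returns
# [index, char_from_a, char_from_b] for every zero-based position
# where two equal-length strings differ.
def diff_positions(a, b)
  raise ArgumentError, 'strings must be the same length' unless a.size == b.size

  # Pair up the characters, then keep only the positions that disagree.
  a.chars.zip(b.chars).each_with_index.map do |(x, y), i|
    [i, x, y] if x != y
  end.compact
end

str1 = 'jackdaws love my giant sphinx of quartz'
str2 = 'jackdaws l0ve my gi4nt sphinx 0f qu4rtz'

p diff_positions(str1, str2)
# => [[10, "o", "0"], [19, "a", "4"], [30, "o", "0"], [35, "a", "4"]]
```

This walks the characters once instead of building and subtracting two indexed arrays, which may matter as the strings grow.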

EDIT: Added more info.

If you have an arbitrary number of strings and need to compare them all, use Array#combination:

str1 = 'ACCT'
str2 = 'ATCT'
str3 = 'AGCT'

require 'pp'

pp [str1, str2, str3].combination(2).to_a
>> [["ACCT", "ATCT"], ["ACCT", "AGCT"], ["ATCT", "AGCT"]]

In the above output you can see that combination cycles through the array, returning the various n-sized combinations of the array elements (here n = 2, so every pair).

pp [str1, str2, str3].combination(2).map { |a, b| a.each_char.with_index.to_a - b.each_char.with_index.to_a }
>> [[["C", 1]], [["C", 1]], [["T", 1]]]

Using combination's output you can cycle through the array, comparing all the elements against each other. So, in the array returned above, for the "ACCT" and "ATCT" pair, 'C' is the character from the first string that differs, at index 1 (zero-based). Similarly, for "ACCT" and "AGCT" the difference is 'C' again, at index 1. Finally, for "ATCT" and "AGCT" it's 'T' at index 1.

Because we already saw with the longer sample strings that the code returns multiple changed characters, this should get you pretty close.
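
The pairwise approach compares n(n-1)/2 pairs for n strings. When the goal is simply "which positions are not identical across all strings", a column-wise pass is an alternative. The sketch below is an assumption-laden illustration, not the answerer's method: the name `differing_columns` is hypothetical, and it requires every string to have the same length.

```ruby
# Hypothetical helper (not from the original answer): for any number of
# equal-length strings, returns [index, distinct_chars] pairs for each
# zero-based position where the strings do not all agree.
def differing_columns(strings)
  return [] if strings.empty?

  length = strings.first.size
  unless strings.all? { |s| s.size == length }
    raise ArgumentError, 'strings must all be the same length'
  end

  # transpose turns the per-string character arrays into per-position columns.
  columns = strings.map(&:chars).transpose

  columns.each_with_index.map do |column, i|
    distinct = column.uniq
    [i, distinct] if distinct.size > 1
  end.compact
end

p differing_columns(%w[ACCT ATCT AGCT])
# => [[1, ["C", "T", "G"]]]
```

Each character is visited once, so the work grows with the number of strings times their length rather than with the number of pairs.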
