计算两个NSStrings之间的差异数 [英] Calculate number of differences between two NSStrings

查看:136
本文介绍了计算两个NSStrings之间的差异数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何计算两个NSStrings之间的差异数。

How can I calculate the number of differences between two NSStrings.

示例:

NSString 1 = "this is a string"

NSString 2 = "Tihs isa string"

应返回:4(一个用于大写T,一个用于i,一个用于h并用于缺少空格)

should return: 4 (one for the capital "T", one for the "i", the "h" and for the missing space)

推荐答案

您要查找的是 Levenshtein距离

Objective-C中的实现:

An implementation in Objective-C:

------------------------------------------------------------------------ 

//
//  NSString-Levenshtein.h
//
//  Created by Rick Bourner on Sat Aug 09 2003.
//  rick@bourner.com

@interface NSString(Levenshtein)

// calculate the smallest distance between all words in stringA and stringB
- (float) compareWithString: (NSString *) stringB;

// calculate the distance between two string treating them each as a
// single word
- (float) compareWithWord: (NSString *) stringB;

// return the minimum of a, b and c
- (int) smallestOf: (int) a andOf: (int) b andOf: (int) c;

@end

--------------------------------------------------------------------

//
//  NSString-Levenshtein.m
//
//  Created by Rick Bourner on Sat Aug 09 2003.
//  Rick@Bourner.com

#import "NSString-Levenshtein.h"


@implementation NSString(Levenshtein)

// calculate the mean distance between all words in stringA and stringB
- (float) compareWithString: (NSString *) stringB
{
     float averageSmallestDistance = 0.0;
     float smallestDistance;
     float distance;

     NSMutableString * mStringA = [[NSMutableString alloc]  initWithString: self];
     NSMutableString * mStringB = [[NSMutableString alloc]  initWithString: stringB];


     // normalize
     [mStringA replaceOccurrencesOfString:@"\n"
                              withString: @" "
                                 options: NSLiteralSearch
                                   range: NSMakeRange(0, [mStringA  length])];

     [mStringB replaceOccurrencesOfString:@"\n"
                              withString: @" "
                                 options: NSLiteralSearch
                                   range: NSMakeRange(0, [mStringB  length])];

     NSArray * arrayA = [mStringA componentsSeparatedByString: @" "];
     NSArray * arrayB = [mStringB componentsSeparatedByString: @" "];

     NSEnumerator * emuA = [arrayA objectEnumerator];
     NSEnumerator * emuB;

     NSString * tokenA = NULL;
     NSString * tokenB = NULL;

     // O(n*m) but is there another way ?!?
     while ( tokenA = [emuA nextObject] ) {

         emuB = [arrayB objectEnumerator];
         smallestDistance = 99999999.0;

         while ( tokenB = [emuB nextObject] )
             if ( (distance = [tokenA compareWithWord: tokenB] ) <  smallestDistance )
                 smallestDistance = distance;

         averageSmallestDistance += smallestDistance;

     }

     [mStringA release];
     [mStringB release];

     return averageSmallestDistance / [arrayA count];
}


// calculate the distance between two string treating them eash as a
// single word
- (float) compareWithWord: (NSString *) stringB
{
     // normalize strings
     NSString * stringA = [NSString stringWithString: self];
     [stringA stringByTrimmingCharactersInSet:
               [NSCharacterSet whitespaceAndNewlineCharacterSet]];
     [stringB stringByTrimmingCharactersInSet:
               [NSCharacterSet whitespaceAndNewlineCharacterSet]];
     stringA = [stringA lowercaseString];
     stringB = [stringB lowercaseString];


     // Step 1
     int k, i, j, cost, * d, distance;

     int n = [stringA length];
     int m = [stringB length];  

     if( n++ != 0 && m++ != 0 ) {

         d = malloc( sizeof(int) * m * n );

         // Step 2
         for( k = 0; k < n; k++)
             d[k] = k;

         for( k = 0; k < m; k++)
             d[ k * n ] = k;

         // Step 3 and 4
         for( i = 1; i < n; i++ )
             for( j = 1; j < m; j++ ) {

                 // Step 5
                 if( [stringA characterAtIndex: i-1] == 
                      [stringB characterAtIndex: j-1] )
                     cost = 0;
                 else
                     cost = 1;

                 // Step 6
                 d[ j * n + i ] = [self smallestOf: d [ (j - 1) * n + i ] + 1
                                             andOf: d[ j * n + i - 1 ] +  1
                                             andOf: d[ (j - 1) * n + i -1 ] + cost ];
             }

         distance = d[ n * m - 1 ];

         free( d );

         return distance;
     }
     return 0.0;
}


// return the minimum of a, b and c
- (int) smallestOf: (int) a andOf: (int) b andOf: (int) c
{
     int min = a;
     if ( b < min )
         min = b;

     if( c < min )
         min = c;

     return min;
}

@end

Rick Bourner, http://www.merriampark.com/ldobjc.htm

Author of the source above: Rick Bourner, http://www.merriampark.com/ldobjc.htm

这篇关于计算两个NSStrings之间的差异数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆