Why is this program faster in Python than Objective-C?

Views: 68 | Posted: 2021/6/9 20:15:10 | Tags: python objective-c nsstring


Problem description

I got interested in this small example of a Python algorithm for looping through a large word list. I am writing a few "tools" that will allow me to slice an Objective-C string or array in a similar fashion to Python.

Specifically, this elegant solution caught my attention because it executes very quickly and uses a string slice as a key element of the algorithm. Try to solve this without a slice!
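As a minimal illustration of the slicing the question relies on, the sketch below splits one word into its even-indexed and odd-indexed characters. The word "schooled" is a hypothetical example chosen for illustration, not taken from the Moby list, and the snippet assumes Python 3 (the question's own scripts are Python 2).

```python
# Extended slice syntax is s[start:stop:step]; omitted bounds default to the ends.
word = "schooled"      # hypothetical example word

even = word[::2]       # characters at indices 0, 2, 4, 6
odd = word[1::2]       # characters at indices 1, 3, 5, 7

print(even, odd)       # shoe cold
```

Each slice is built in a single operation rather than one character at a time.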

I have reproduced my local version below, using the Moby word list. You can use /usr/share/dict/words if you do not feel like downloading Moby. The source is just a large, dictionary-like list of unique words.

#!/usr/bin/env python

count=0
words = set(line.strip() for line in   
           open("/Users/andrew/Downloads/Moby/mwords/354984si.ngl"))
for w in words:
    even, odd = w[::2], w[1::2]
    if even in words and odd in words:
        count+=1

print count

This script will a) be interpreted by Python; b) read the 4.1 MB, 354,983-word Moby dictionary file; c) strip the lines; d) place the lines into a set; and e) find all the words whose even-indexed and odd-indexed characters each also form a word. This executes in about 0.73 seconds on a MacBook Pro.
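The steps listed above can be sketched as a small function that runs on an in-memory list instead of the Moby file. This is a Python 3 sketch under stated assumptions: the name `count_alternating` and the sample words are hypothetical, and blank lines are dropped to mirror the `noEmptyStrings` filter in the Objective-C version.

```python
def count_alternating(lines):
    """Count words whose even- and odd-indexed characters both form words."""
    # Steps c) and d): strip each line and collect the results in a set.
    words = set(line.strip() for line in lines)
    # Assumption: an empty string is not a word (matches the Objective-C filter).
    words.discard("")

    count = 0
    # Step e): test both slices of every word for set membership.
    for w in words:
        if w[::2] in words and w[1::2] in words:
            count += 1
    return count


# Hypothetical sample input standing in for the dictionary file.
sample = ["schooled\n", "shoe\n", "cold\n", "python\n"]
print(count_alternating(sample))   # 1
```

To run it against a real word list, pass an open file object instead of `sample`, since iterating over a file yields its lines.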

I tried to rewrite the same program in Objective-C. I am a beginner at this language, so please go easy on me, but do point out the errors.

#import <Foundation/Foundation.h>

NSString *sliceString(NSString *inString, NSUInteger start, NSUInteger stop, 
        NSUInteger step){
    NSUInteger strLength = [inString length];

    if(stop > strLength) {
        stop = strLength;
    }

    if(start > strLength) {
        start = strLength;
    }

    NSUInteger capacity = (stop-start)/step;
    NSMutableString *rtr=[NSMutableString stringWithCapacity:capacity];    

    for(NSUInteger i=start; i < stop; i+=step){
        [rtr appendFormat:@"%c",[inString characterAtIndex:i]];
    }
    return rtr;
}

NSSet * getDictWords(NSString *path){

    NSError *error = nil;
    NSString *words = [[NSString alloc] initWithContentsOfFile:path
                         encoding:NSUTF8StringEncoding error:&error];
    NSCharacterSet *sep=[NSCharacterSet newlineCharacterSet];
    NSPredicate *noEmptyStrings = 
                     [NSPredicate predicateWithFormat:@"SELF != ''"];

    if (words == nil) {
        // deal with error ...
    }
    // ...

    NSArray *temp=[words componentsSeparatedByCharactersInSet:sep];
    NSArray *lines = 
        [temp filteredArrayUsingPredicate:noEmptyStrings];

    NSSet *rtr=[NSSet setWithArray:lines];

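    // %lu expects unsigned long; the original "%lul" printed a stray 'l' after each number
    NSLog(@"lines: %lu, word set: %lu",(unsigned long)[lines count],(unsigned long)[rtr count]);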
    [words release];

    return rtr;    
}

int main (int argc, const char * argv[])
{
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

    int count=0;

    NSSet *dict = 
       getDictWords(@"/Users/andrew/Downloads/Moby/mwords/354984si.ngl");

    NSLog(@"Start");

    for(NSString *element in dict){
        NSString *odd_char=sliceString(element, 1,[element length], 2);
        NSString *even_char=sliceString(element, 0, [element length], 2);
        if([dict member:even_char] && [dict member:odd_char]){
            count++;
        }

    }    
    NSLog(@"count=%i",count);

    [pool drain];
    return 0;
}

The Objective-C version produces the same result (13,341 words) but takes almost 3 seconds to do it. I must be doing something atrociously wrong for a compiled language to be more than 3X slower than a scripting language, but I'll be darned if I can see why.

The basic algorithm is the same: read the lines, strip them, and put them in a set.

My guess is that the slow part is the processing of the NSString elements, but I do not know of an alternative.

Edit

I edited the Python to be this:

#!/usr/bin/env python
import codecs
count=0
words = set(line.strip() for line in 
     codecs.open("/Users/andrew/Downloads/Moby/mwords/354984si.ngl",
          encoding='utf-8'))
for w in words:
    if w[::2] in words and w[1::2] in words:
        count+=1

print count

This puts the Python on the same footing as the Objective-C version, which reads the file into an NSString as UTF-8. The change slowed the Python down to 1.9 seconds.
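To show what decoding changes for the slicing step, here is a small Python 3 sketch comparing a slice of raw UTF-8 bytes with a slice of the decoded text. The word is a hypothetical example, and the snippet does not attempt to explain or reproduce the timing difference reported above.

```python
# "na\u00efve" contains one non-ASCII character (U+00EF), which takes two bytes in UTF-8.
raw = "na\u00efve\n".encode("utf-8")   # bytes as they would sit in a UTF-8 file
text = raw.decode("utf-8")             # decoded text, one element per character

raw_word = raw.strip()
text_word = text.strip()

print(len(raw_word), len(text_word))   # 6 5

# Slicing bytes steps over bytes and can cut a multi-byte character in half.
print(raw_word[::2])                   # b'n\xc3v'

# Slicing decoded text steps over whole characters.
print(text_word[::2] == "n\u00efe")    # True
```

With undecoded bytes, the even and odd slices of a non-ASCII word would not be valid words, so decoding is needed for a like-for-like comparison with NSString.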

I also switched the slice test to the short-circuit form, as suggested, in both the Python and Objective-C versions. Now they run at close to the same speed. I also tried using C arrays rather than NSStrings, and this was much faster, but not as easy. You also lose UTF-8 support doing that.
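The effect of the short-circuit form can be made visible by counting how often the second slice is actually computed. This is a Python 3 sketch; `odd_slice`, the `calls` counter, and the word set are hypothetical names introduced only for illustration.

```python
calls = {"odd": 0}   # counts how many times the odd slice is computed


def odd_slice(w):
    """Return the odd-indexed characters of w, recording each call."""
    calls["odd"] += 1
    return w[1::2]


words = {"schooled", "shoe", "cold", "python"}   # hypothetical word set
count = 0
for w in words:
    # `and` evaluates the right-hand side only when the left-hand side is true,
    # so odd_slice runs only for words whose even slice is already a word.
    if w[::2] in words and odd_slice(w) in words:
        count += 1

print(count, calls["odd"])   # 1 1
```

Here only one of the four words passes the first test, so the second slice is built once rather than four times. The first version of the script computed both slices for every word before testing either.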

Python is really cool...

Edit 2

I found a bottleneck, and removing it sped things up considerably. Instead of using [rtr appendFormat:@"%c",[inString characterAtIndex:i]]; to append a character to the return string, I used this:

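unichar buf[1];   // one-character scratch buffer (declaration not shown in the original snippet)
for(NSUInteger i=start; i < stop; i+=step){
    buf[0]=[inString characterAtIndex:i];
    // append the character directly instead of going through format-string parsing
    [rtr appendString:[NSString stringWithCharacters:buf length:1]];
}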

Now I can finally claim that the Objective-C version is faster than the Python version, but not by much.
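As a rough Python analogy (not a measurement of the Objective-C code), the sketch below compares building a slice one formatted character at a time, appending characters directly, and using the built-in slice. The function names are hypothetical, Python 3 is assumed, and the timings will vary by machine, so none are quoted.

```python
import timeit


def slice_by_format(s, start, stop, step):
    """Build the result one character at a time through a format string."""
    out = ""
    for i in range(start, min(stop, len(s)), step):
        out += "%c" % s[i]   # per-character formatting, like appendFormat:@"%c"
    return out


def slice_by_append(s, start, stop, step):
    """Build the result one character at a time without formatting."""
    out = ""
    for i in range(start, min(stop, len(s)), step):
        out += s[i]
    return out


def slice_builtin(s, start, stop, step):
    """Use the built-in slice, which does the work in one operation."""
    return s[start:stop:step]


word = "schooled"   # hypothetical example word
for fn in (slice_by_format, slice_by_append, slice_builtin):
    seconds = timeit.timeit(lambda: fn(word, 0, len(word), 2), number=20000)
    print(fn.__name__, round(seconds, 4))
```

All three functions return the same string; only the amount of per-character work differs.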
