Most efficient way to find if a string is mixedCase

Question

Suppose I have very long strings and I want to see whether a column is allLower, allUpper, or mixedCase. For example, take the following column:

text
"hello"
"New"
"items"
"iTem12"
"-3nXy"

This column would be mixedCase. A naive algorithm to determine this might be:

int is_mixed_case, is_all_lower, is_all_upper;
int has_lower = 0;
int has_upper = 0;
char c;  // current character of the string s being scanned
// for each row...for each column...
for (int i = 0; (c = s[i]) != '\0'; i++) {
    if (c >='a' && c <= 'z') {
        has_lower = 1;
        if (has_upper) break;
    }
    else if (c >='A' && c <= 'Z') {
        has_upper = 1;
        if (has_lower) break;
    }
}

is_all_lower = has_lower && !has_upper;
is_all_upper = has_upper && !has_lower;
is_mixed_case = has_lower && has_upper;

I'm sure there is a more performant way to do this, however. What would be the most efficient way to perform this calculation?

Recommended answer

If you know the character encoding that's going to be used (I've used ISO/IEC 8859-15 in the code example), a look-up table may be the fastest solution. This also allows you to decide which characters from the extended character set, such as µ or ß, you'll count as upper case, lower case or non-alphabetical.

char test_case(const char *s) {
    static const char alphabet[] = {
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,  //  ABCDEFGHIJKLMNO
        1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,  // PQRSTUVWXYZ
        0,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,  //  abcdefghijklmno
        2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,  // pqrstuvwxyz
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,1,0,2,0,2,0,0,0,0,0,  //       Š š ª
        0,0,0,0,1,2,0,0,2,0,2,0,1,2,1,0,  //     Žµ  ž º ŒœŸ
        1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,  // ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ
        1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,  // ÐÑÒÓÔÕÖ ØÙÚÛÜÝÞß
        2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,  // àáâãäåæçèéêëìíîï
        2,2,2,2,2,2,2,0,2,2,2,2,2,2,2,2}; // ðñòóôõö øùúûüýþÿ
    char cases = 0;
    while (*s && cases != 3) {
        cases |= alphabet[(unsigned char) *s++];
    }
    return cases; // 0 = none, 1 = upper, 2 = lower, 3 = mixed
}
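
test_case classifies a single string, while the question asks about a whole column. One way to bridge that, offered only as a sketch, is to OR the per-row results together and stop as soon as both cases have been seen. To stay self-contained, the block below fills the table from <ctype.h> rather than repeating the ISO/IEC 8859-15 table, so in the default "C" locale it counts ASCII letters only. The names build_case_table, test_case_with and classify_column are hypothetical and not part of the answer.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: 1 = upper, 2 = lower, 0 = other, as reported by
   the current C locale (ASCII letters only in the default "C" locale). */
static void build_case_table(unsigned char table[256]) {
    for (int c = 0; c < 256; c++) {
        table[c] = isupper(c) ? 1 : islower(c) ? 2 : 0;
    }
}

/* Same scan as test_case, but reading the table passed in. */
static int test_case_with(const unsigned char table[256], const char *s) {
    int cases = 0;
    while (*s && cases != 3) {
        cases |= table[(unsigned char) *s++];
    }
    return cases; /* 0 = none, 1 = upper, 2 = lower, 3 = mixed */
}

/* OR the per-row results; stop early once the column is known to be mixed. */
static int classify_column(const unsigned char table[256],
                           const char *const *rows, size_t n) {
    int cases = 0;
    for (size_t i = 0; i < n && cases != 3; i++) {
        cases |= test_case_with(table, rows[i]);
    }
    return cases;
}
```

For the example column from the question, the scan stops at "New", because that row already contains both an upper-case and a lower-case letter.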

As suggested in a comment by chux, you can set the value of alphabet[0] to 4. The terminating NUL then pushes cases to 4 or more, so the while loop needs only the single condition cases < 3. In that case, mask the result with 3 before returning it so that the extra bit does not leak out.
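
The single-condition loop might look roughly like the sketch below. To keep it short, the table is built at run time and covers ASCII letters only, which is an assumption made for this example and not what the answer's table does. The names sentinel_table, init_sentinel_table and test_case_sentinel are hypothetical.

```c
/* Table with a sentinel: entry 0 (the terminating NUL) carries bit value 4. */
static unsigned char sentinel_table[256];

/* Must be called once before test_case_sentinel. Assumes an ASCII-compatible
   execution character set, where 'A'..'Z' and 'a'..'z' are contiguous. */
static void init_sentinel_table(void) {
    for (int c = 0; c < 256; c++) sentinel_table[c] = 0;
    for (int c = 'A'; c <= 'Z'; c++) sentinel_table[c] = 1;
    for (int c = 'a'; c <= 'z'; c++) sentinel_table[c] = 2;
    sentinel_table[0] = 4;
}

static int test_case_sentinel(const char *s) {
    int cases = 0;
    /* Exits when both cases were seen (cases == 3) or when the NUL was
       read (cases >= 4), so the loop never reads past the terminator. */
    while (cases < 3) {
        cases |= sentinel_table[(unsigned char) *s++];
    }
    return cases & 3; /* drop the sentinel bit: 0 none, 1 upper, 2 lower, 3 mixed */
}
```

The loop trades the explicit end-of-string test for one extra table value. Whether that is measurably faster depends on the compiler and the data, so it is worth benchmarking before relying on it.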
