How to check that two format strings are compatible?

Question

Examples:

"Something %d"        and "Something else %d"       // Compatible
"Something %d"        and "Something else %f"       // Not Compatible
"Something %d"        and "Something %d else %d"    // Not Compatible
"Something %d and %f" and "Something %2$f and %1$d" // Compatible

I figured there should be some C function for this, but I'm not getting any relevant search results. I mean the compiler is checking that the format string and the arguments match, so the code for checking this is already written. The only question is how I can call it.

I'm using Objective-C, so if there is an Objective-C specific solution that's fine too.

Recommended answer

Checking whether two printf() format strings are compatible is an exercise in format parsing.

C, at least, has no standard run-time function for comparing them, such as:

int format_cmp(const char *f1, const char *f2); // Does not exist

Formats like "%d %f" and "%i %e" are obviously compatible in that both expect an int and a float/double. Note: a float argument is promoted to double, just as short and signed char arguments are promoted to int.
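The promotion rule can be observed directly in a variadic function. The following is a minimal sketch, not part of the answer's code: `sum_doubles` and `sum_ints` are hypothetical helpers that read their variadic arguments with the promoted types.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments, reading each one as
   a double. A float argument arrives as a double because of the default
   argument promotions, so va_arg(ap, double) is the matching read. */
double sum_doubles(int count, ...) {
  va_list ap;
  va_start(ap, count);
  double total = 0.0;
  for (int i = 0; i < count; i++)
    total += va_arg(ap, double);
  va_end(ap);
  return total;
}

/* Hypothetical helper: sums `count` variadic arguments, reading each one as
   an int. short and signed char arguments are promoted to int by the call. */
int sum_ints(int count, ...) {
  va_list ap;
  va_start(ap, count);
  int total = 0;
  for (int i = 0; i < count; i++)
    total += va_arg(ap, int);
  va_end(ap);
  return total;
}
```

A call such as sum_doubles(2, 1.5f, 2.25f) passes two float values, yet the callee reads two double values. This is why "%f" and "%e" place the same requirement on the argument list.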

Formats "%*.*f" and "%i %d %e" are compatible, though not obviously so: both expect an int, an int and a float/double.
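To make the argument lists concrete, here is a small sketch using snprintf(). The wrapper names `format_star` and `format_plain` are assumptions made for illustration only. Both wrappers hand the same three argument types (int, int, double) to their format.

```c
#include <stdio.h>

/* "%*.*f" consumes an int (width), an int (precision), then a double. */
int format_star(char *buf, size_t size, int width, int precision, double value) {
  return snprintf(buf, size, "%*.*f", width, precision, value);
}

/* "%i %d %e" also consumes an int, an int, then a double. */
int format_plain(char *buf, size_t size, int a, int b, double value) {
  return snprintf(buf, size, "%i %d %e", a, b, value);
}
```

The rendered text differs, of course. Compatibility here concerns only the number and types of the arguments each format consumes.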

Formats "%hhd" and "%d" both expect an int, even though the first converts the value to signed char before printing.
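A short sketch of that point, with `format_hhd` as a hypothetical wrapper: the argument handed to "%hhd" is still an int, and printf() narrows it internally.

```c
#include <stdio.h>

/* The argument for "%hhd" is passed as an int; printf() converts it to
   signed char before printing. Values outside the signed char range are
   therefore altered by that conversion (typically reduced modulo 256). */
int format_hhd(char *buf, size_t size, int value) {
  return snprintf(buf, size, "%hhd", value);
}
```

For values that fit in a signed char, the output is the same as with "%d".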

Formats "%d" and "%u" are not compatible, even though many systems will behave as hoped. Note: typically char promotes to int.

Formats "%d" and "%ld" are not strictly compatible. On a system where int and long are both 32-bit they are equivalent, but not in general. Of course, the code can be altered to accommodate this. On the other hand, "%lf" and "%f" are compatible due to the usual argument promotion of float to double.

Formats "%lu" and "%zu" may be compatible, but that depends on how the implementation defines unsigned long and size_t. Additions to the code could allow this or related equivalences.
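Whether such implementation-dependent pairs match can be probed at compile time. The sketch below assumes a C11 compiler (for _Generic); the function names are hypothetical and are not used by the answer's code.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if size_t is the very same type as unsigned long on this
   implementation, so that "%zu" and "%lu" expect the same argument type. */
int size_t_is_unsigned_long(void) {
  return _Generic((size_t)0, unsigned long: 1, default: 0);
}

/* Returns 1 if long has the same size and range as int here. The types are
   still distinct, so "%d" and "%ld" remain formally different. */
int long_matches_int(void) {
  return sizeof(long) == sizeof(int) && LONG_MAX == INT_MAX;
}
```

A comparison function could consult checks like these when deciding whether to treat two differing modifiers as equivalent on the current platform.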

Some combinations of modifiers and specifiers, such as "%zp", are not defined. The following code does not disallow such esoteric combinations, but it does compare them.

Modifiers like "$" (positional arguments such as "%1$d") are extensions to standard C and are not implemented in the following code.

The compatibility test for printf() differs from the one for scanf().
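One concrete difference, sketched below with hypothetical wrapper names: scanf() receives pointers, so no promotion takes place and "%f" (float *) is not interchangeable with "%lf" (double *). With printf() both consume a double.

```c
#include <stdio.h>

/* For scanf-family functions "%f" needs a float * and "%lf" a double *.
   Swapping them would be undefined behavior. */
int scan_float_and_double(const char *text, float *f, double *d) {
  return sscanf(text, "%f %lf", f, d);
}

/* For printf-family functions the float argument is promoted to double,
   so "%f" and "%lf" accept the same argument. */
int print_float_and_double(char *buf, size_t size, float f, double d) {
  return snprintf(buf, size, "%f %lf", f, d);
}
```

A scanf() variant of the comparison would therefore have to keep the l modifier on floating-point specifiers instead of discarding it.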

#include <ctype.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

typedef enum {
  type_none,
  type_int,
  type_unsigned,
  type_float,
  type_charpointer,
  type_voidpointer,
  type_intpointer,
  type_unknown,
  type_type_N = 0xFFFFFF
} type_type;

typedef struct {
  const char *format;
  int int_queue;
  type_type type;
} format_T;

static void format_init(format_T *state, const char *format);
static type_type format_get(format_T *state);
static void format_next(format_T *state);

void format_init(format_T *state, const char *format) {
  state->format = format;
  state->int_queue = 0;
  state->type = type_none;
  format_next(state);
}

type_type format_get(format_T *state) {
  if (state->int_queue > 0) {
    return type_int;
  }
  return state->type;
}

const char *seek_flag(const char *format) {
  // strchr() also matches the terminating '\0', so test for it explicitly
  while (*format != '\0' && strchr("-+ #0", *format) != NULL)
    format++;
  return format;
}

const char *seek_width(const char *format, int *int_queue) {
  *int_queue = 0;
  if (*format == '*') {
    format++;
    (*int_queue)++;
  } else {
    while (isdigit((unsigned char ) *format))
      format++;
  }
  if (*format == '.') {
    format++; // Skip the '.' so that the precision itself is examined next
    if (*format == '*') {
      format++;
      (*int_queue)++;
    } else {
      while (isdigit((unsigned char ) *format))
        format++;
    }
  }
  return format;
}

const char *seek_mod(const char *format, int *mod) {
  *mod = 0;
  if (format[0] == 'h' && format[1] == 'h') {
    format += 2;
  } else if (format[0] == 'l' && format[1] == 'l') {
    *mod = ('l' << CHAR_BIT) + 'l';
    format += 2;
  } else if (*format != '\0' && strchr("ljztL", *format)) {
    // strchr() also matches the terminating '\0', hence the explicit check
    *mod = *format;
    format++;
  } else if (*format == 'h') {
    format++;
  }
  return format;
}

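const char *seek_specifier(const char *format, int mod, type_type *type) {
  // A format that ends right after '%' (or after a modifier) has no
  // specifier. strchr() would match the '\0', so handle that case first.
  if (*format == '\0') {
    *type = type_unknown;
    return format;
  }
  if (strchr("di", *format)) {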
    *type = type_int;
    format++;
  } else if (strchr("ouxX", *format)) {
    *type = type_unsigned;
    format++;
  } else if (strchr("fFeEgGaA", *format)) {
    if (mod == 'l') mod = 0;
    *type = type_float;
    format++;
  } else if (strchr("c", *format)) {
    *type = type_int;
    format++;
  } else if (strchr("s", *format)) {
    *type = type_charpointer;
    format++;
  } else if (strchr("p", *format)) {
    *type = type_voidpointer;
    format++;
  } else if (strchr("n", *format)) {
    *type = type_intpointer;
    format++;
  } else {
    // Unknown specifier: report it so that format_cmp() can return 2
    *type = type_unknown;
    return format;
  }
  *type |= mod << CHAR_BIT; // Bring in modifier
  return format;
}

void format_next(format_T *state) {
  if (state->int_queue > 0) {
    state->int_queue--;
    return;
  }
  while (*state->format) {
    if (state->format[0] == '%') {
      state->format++;
      if (state->format[0] == '%') {
        state->format++;
        continue;
      }
      state->format = seek_flag(state->format);
      state->format = seek_width(state->format, &state->int_queue);
      int mod;
      state->format = seek_mod(state->format, &mod);
      state->format = seek_specifier(state->format, mod, &state->type);
      return;
    } else {
      state->format++;
    }
  }
  state->type = type_none;
}

// 0 Compatible
// 1 Not Compatible
// 2 Not Comparable
int format_cmp(const char *f1, const char *f2) {
  format_T state1;
  format_init(&state1, f1);
  format_T state2;
  format_init(&state2, f2);
  while (format_get(&state1) == format_get(&state2)) {
    if (format_get(&state1) == type_none)
      return 0;
    if (format_get(&state1) == type_unknown)
      return 2;
    format_next(&state1);
    format_next(&state2);
  }
  if (format_get(&state1) == type_unknown)
    return 2;
  if (format_get(&state2) == type_unknown)
    return 2;
  return 1;
}

Note: only minimal testing has been done. Many additional considerations could be added.

Known shortcomings: the hh, h, l, ll, j, z and t modifiers with n; the l modifier with s and c.

The OP commented about security concerns. This changes the nature of the post, and of the comparison, from an equality test to a security test. I'd imagine that one of the patterns (A) would be a reference pattern and the other (B) would be the one under test. The test would be "is B at least as secure as A?". Example: A = "%.20s" and B1 = "%.19s", B2 = "%.20s", B3 = "%.21s". B1 and B2 both pass the security test, as they do not extract more than 20 char. B3 is a problem, as it goes past the reference limit of 20 char. Furthermore, any %s, %[ or %c without a width limit is a security problem, whether in the reference or in the test pattern. This answer's code does not address this issue.
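The "at least as secure" idea can be sketched for the simple "%.Ns" case from the example. This is only an illustration under stated assumptions: `first_s_precision` and `is_at_least_as_bounded` are hypothetical helpers, only the first %s conversion is inspected, and "*" precisions and very large numbers are not handled.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns the precision of the first "%s" conversion
   in fmt, or -1 if that conversion has no numeric precision (unbounded) or
   if there is no "%s" at all. */
long first_s_precision(const char *fmt) {
  const char *p = fmt;
  while ((p = strchr(p, '%')) != NULL) {
    p++;                                  /* skip the '%' */
    if (*p == '%') { p++; continue; }     /* "%%" is a literal percent */
    long precision = -1;
    while (*p != '\0' && strchr("-+ #0", *p) != NULL)
      p++;                                /* flags */
    while (isdigit((unsigned char)*p))
      p++;                                /* width */
    if (*p == '.') {
      p++;
      precision = 0;                      /* "%.s" means precision 0 */
      while (isdigit((unsigned char)*p)) {
        precision = precision * 10 + (*p - '0');
        p++;
      }
    }
    if (*p == 's')
      return precision;
  }
  return -1;
}

/* Returns 1 if `test` limits its first %s at least as tightly as `ref`.
   An unbounded %s on either side is rejected. */
int is_at_least_as_bounded(const char *ref, const char *test) {
  long ref_precision = first_s_precision(ref);
  long test_precision = first_s_precision(test);
  if (ref_precision < 0 || test_precision < 0)
    return 0;
  return test_precision <= ref_precision;
}
```

With A = "%.20s" as the reference, B1 = "%.19s" and B2 = "%.20s" are accepted and B3 = "%.21s" is rejected, matching the example above. A full solution would need to walk every conversion of both formats in step, as format_cmp() does for types.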

As mentioned, the code does not yet handle modifiers with "%n".

[Edit 2018]

Concerning 'Formats "%d" and "%u" are not compatible.': this applies to values to be printed in general. For values in the range [0..INT_MAX], either format may work per C11dr §6.5.2.2 ¶6.
