How to introspect regexes in the Perl API


Question

I'm working on some code that needs to serialize Perl regexes, including any regex flags. Only a subset of flags are supported, so I need to detect when unsupported flags like /u are in the regex object.

The current version of the code does this:

static void serialize_regex_flags(buffer *buf, SV *sv) {
  char flags[] = {0,0,0,0,0,0};
  unsigned int i = 0, f = 0;
  STRLEN string_length;
  char *string = SvPV(sv, string_length);

It then processes string manually, char by char, to find the flags.

The problem here is that the stringification of regex flags changed (I think in Perl 5.14) from e.g. (?i-xsm:foo) to (?^i:foo), which makes parsing a pain.

I could check the version of perl, or just write the parser to handle both cases, but something tells me there must be a superior method of introspection available.

Recommended answer

In Perl, you'd use re::regexp_pattern.

 use feature 'say';

 my $re = qr/foo/i;
 my ($pat, $mods) = re::regexp_pattern($re);
 say $pat;   # foo
 say $mods;  # i

As you can see from the source of regexp_pattern, there's no function in the API to obtain that information, so I recommend calling that function from XS as well.

perlcall covers calling Perl functions from C. I came up with the following untested code:

/* Calls re::regexp_pattern to extract the pattern
 * and flags from a compiled regex.
 *
 * When re isn't a compiled regex, returns false,
 * and *pat_ptr and *flags_ptr are set to NULL.
 *
 * The caller must free() *pat_ptr and *flags_ptr.
 */

static int regexp_pattern(char ** pat_ptr, char ** flags_ptr, SV * re) {
   dSP;
   int count;
   ENTER;
   SAVETMPS;
   PUSHMARK(SP);
   XPUSHs(re);
   PUTBACK;
   count = call_pv("re::regexp_pattern", G_ARRAY);
   SPAGAIN;

   if (count == 2) {
      /* Pop last one first. */
      SV * flags_sv = POPs;
      SV * pat_sv   = POPs;

      /* XXX Assumes no NUL in pattern */
      char * pat   = SvPVutf8_nolen(pat_sv); 
      char * flags = SvPVutf8_nolen(flags_sv);

      *pat_ptr   = strdup(pat);
      *flags_ptr = strdup(flags);
   } else {
      /* Discard any unexpected return values so the stack stays balanced. */
      SP -= count;
      *pat_ptr   = NULL;
      *flags_ptr = NULL;
   }

   PUTBACK;
   FREETMPS;
   LEAVE;

   return *pat_ptr != NULL;
}

Usage:

SV * re = ...;

char * pat;
char * flags;
if (regexp_pattern(&pat, &flags, re)) {
   /* ... use pat and flags ... */
   free(pat);
   free(flags);
}
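
Once the flags come back as a plain string, detecting an unsupported flag such as /u no longer depends on how the regex stringifies. The following is a minimal sketch of that check, not part of the original answer: the helper name first_unsupported_flag and the supported set "imsx" are assumptions made for illustration, so substitute whichever subset your serializer actually accepts.

```c
#include <string.h>

/* Hypothetical helper (not part of the Perl API).
 *
 * Returns the first character of `flags` that does not appear in
 * `supported`, or '\0' when every flag is supported.
 * Both arguments must be NUL-terminated strings.
 */
static char first_unsupported_flag(const char *flags, const char *supported) {
    for (; *flags != '\0'; flags++) {
        if (strchr(supported, *flags) == NULL) {
            return *flags;   /* this flag is outside the supported set */
        }
    }
    return '\0';             /* nothing unsupported was found */
}
```

With the flags string filled in by regexp_pattern above, a call such as first_unsupported_flag(flags, "imsx") returns the offending flag character, which the serializer can then report or reject, or '\0' when all flags are acceptable.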
