Force parsing optional groups

Problem description

I'm trying to write a regex that extracts data from report files. The tricky part is that this single regex has to match multiple report file content formats. I want the regex to always match, even if some optional groups are not found.

Consider the contents of the following report files (note that file #2 is missing the "val2" part):

  • File #1: "-val1-test-val2-result-val3-done-"
    • Expected Result:
      • Val1 Group: test
      • Val2 Group: result
      • Val3 Group: done
  • File #2: "-val1-test-val3-done-"
    • Expected Result:
      • Val1 Group: test
      • Val2 Group: (empty)
      • Val3 Group: done

I tried the following regex strings:

      Regex #1(Normal): "-val1-(?<val1>.+?)-val2-(?<val2>.+?)-val3-(?<val3>.+?)-"
      

Problem: File #1 works fine, but on file #2 the regex does not match, so I don't get any group values.

      Regex #2(Non greedy): "-val1-(?<val1>.+?)(-val2-(?<val2>.+?))?-val3-(?<val3>.+?)-"
      Regex #3(Boolean OR): "-val1-(?<val1>.+?)(-val2-(?<val2>.+?)|(.*?))-val3-(?<val3>.+?)-"
      Regex #4(Conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))|(.+?))-val3-(?<val3>.+?)-"
      Regex #5(Conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))(-val2-(?<val2>.+?)))-val3-(?<val3>.+?)-"
      Regex #6(Conditional): "-val1-(?<val1>.+?)(?(-val2-(?<val2>.+?))(-val2-(?<val2>.+?))|(.+?))-val3-(?<val3>.+?)-"
      

Problem: File #2 works as expected, but the val2 group of file #1 is always empty.

Conclusion: The behavior seems to be that even when an optional group is present, the regex prefers an empty group value over the actual value. Is there a way to force the optional groups to capture their values when they are present, and to return (empty) only when they are not?

Note: I'm using the latest .NET Framework, and the code will be ported to Java (Android). I'm trying to avoid multiple operations because of performance and bandwidth concerns.

Can someone help me?

Recommended answer

It is possible if we make some assumptions:

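1. Values may be missing, but they always appear in the same order
2. The first value is always present
3. The parts we are looking for have a delimiter before and after them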

       

      -val1-([^-]+)(?:-val2-([^-]+)|)(?:-val3-([^-]+)|)-
      

      https://regex101.com/r/yY6vF9/1
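
Since the question mentions a port to Java, here is a minimal sketch of how the pattern above might be applied there. The class name `OptionalGroupsDemo` and the helper `extract` are hypothetical names chosen for illustration; the pattern itself is taken unchanged from the answer. In `java.util.regex`, a group that did not take part in the match is reported as `null`, so the sketch maps that case to an empty string to get the "(empty)" result the question asks for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptionalGroupsDemo {
    // The pattern from the answer, unchanged. Each optional section is an
    // alternation between "-valN-<value>" and the empty string.
    private static final Pattern REPORT = Pattern.compile(
            "-val1-([^-]+)(?:-val2-([^-]+)|)(?:-val3-([^-]+)|)-");

    // Hypothetical helper: returns {val1, val2, val3}, or null when the
    // content does not match. An absent section yields "" instead of null.
    static String[] extract(String content) {
        Matcher m = REPORT.matcher(content);
        if (!m.find()) {
            return null;
        }
        String[] values = new String[3];
        for (int i = 0; i < 3; i++) {
            String value = m.group(i + 1); // null if the group did not participate
            values[i] = (value == null) ? "" : value;
        }
        return values;
    }

    public static void main(String[] args) {
        String[] files = {
            "-val1-test-val2-result-val3-done-", // file #1
            "-val1-test-val3-done-"              // file #2, no val2 section
        };
        for (String content : files) {
            String[] v = extract(content);
            System.out.println("val1=" + v[0] + " val2=" + v[1] + " val3=" + v[2]);
        }
    }
}
```

Because the values are matched with `[^-]+`, this approach relies on assumption 3: a value that itself contains the `-` delimiter would not be captured whole.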
