Regex pattern to match terraform module code

Question

I am trying to use a regex to match a Terraform module block and add a comment marker to the beginning of each of its lines, but I can't get the regex to match the module block only. Note that some lines also appear in other blocks, such as resource blocks. The idea is to scan for the module block and comment it out. Any help will be greatly appreciated. I've spent a lot of time bouncing ideas around...

module my module {
name = myaws
version = 1.0
source = terraform.mycompany.com
tag = { cost = poc }
}

data "my file" "file-name-creation-data" {
  template = file("path/file.json")
}

resource aws_iam_role_policy "my-role" {
 name = "first-policy"
 role = new role.rolename
 tag = { cost = pic }
}

Recommended answer

The Terraform language is not a regular language and so there is no fully-general way to process it with regular expressions.

However, the language has some constraints on its block syntax that mean you can potentially write a "good enough" heuristic that will deal with most cases (but still not all). Here are some useful facts about the Terraform language that can help constrain the problem a little:

  • The opening of a block must always appear all on the same line, including the opening brace. It's not valid to include additional newlines between the module keyword and the { brace.

  • There are two ways to write a block:
    • Normal layout is for the header to be on a line of its own, ending with the opening brace that introduces the block body: {.
    • The compact single-line layout has the entire block on one line, with a single argument inside, like module "foo" { source = "./bar" }.

  • The closing brace for a block in normal layout is always on a line of its own.
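As a rough illustration of the block-layout facts above, the two header shapes can be told apart with a pair of patterns. This is only a sketch in Python: the names NORMAL_HEADER, SINGLE_LINE and classify are hypothetical, and the label character class is a loose guess at what a module header may contain (the hyphen is written last so that Python's re module reads it as a literal).

```python
import re

# Normal layout: the header sits on its own line and ends with the opening brace.
NORMAL_HEADER = re.compile(r'^module ["\w -]*\{\s*$')

# Compact layout: the whole block is on one line, with no nested braces inside.
SINGLE_LINE = re.compile(r'^module ["\w -]*\{[^{}]*\}\s*$')


def classify(line):
    """Return "single-line", "normal", or None for a line that is not a module header."""
    if SINGLE_LINE.match(line):
        return "single-line"
    if NORMAL_HEADER.match(line):
        return "normal"
    return None
```

Neither pattern tries to cope with comments in the header or with braces inside quoted strings, so this only helps on input that is already known to be tidy.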

There are some not-so-convenient facts too, of course:

  • Terraform also uses braces for its object constructor expressions, so naively hunting for opening and closing braces will find both block boundaries and object constructor boundaries.

  • The string template syntax uses ${ or %{ as its opening delimiters, but it uses } as its closing delimiter, adding a third meaning for a closing brace.

  • The "heredoc" syntax escapes from the normal parsing rules, which means that an arbitrary number of braces (which do not need to be balanced) can appear inside it. However, a heredoc always starts with << or <<- followed by an identifier at the end of a line, and ends with that same identifier on a line of its own.
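Because heredocs have such a predictable start and end, one possible way to keep them from confusing a brace-based heuristic is to flag their lines up front. The Python sketch below is an assumption-laden illustration: HEREDOC_START and mark_heredoc_lines are hypothetical names, the identifier pattern is a simplification, and the closing marker is matched leniently (surrounding whitespace is ignored for both << and <<-).

```python
import re

# An opening heredoc marker: << or <<- followed by an identifier at the end of the line.
# The identifier pattern here is a simplifying assumption.
HEREDOC_START = re.compile(r'<<-?([A-Za-z_][A-Za-z0-9_]*)\s*$')


def mark_heredoc_lines(lines):
    """Return one bool per line: True if the line is heredoc body or its closing marker."""
    flags = []
    end_marker = None
    for line in lines:
        if end_marker is not None:
            flags.append(True)
            # The heredoc ends at a line holding only the same identifier.
            if line.strip() == end_marker:
                end_marker = None
            continue
        flags.append(False)
        match = HEREDOC_START.search(line)
        if match:
            end_marker = match.group(1)
    return flags
```

Lines flagged True could then simply be skipped when counting braces.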

With all of that said, if you have control over the input and can ensure it will not include "edge cases" like comments in the middle of block headers, heredoc sequences containing what looks like a module block, etc., then you may be able to get a "good enough" result by processing the input on a line-by-line basis:

  • Let B = 0
  • For each line in the input:
    • If B is zero:
      • If the line matches ^module ["\w- ]*{ then take whatever action you want to take for a module block.
    • For each character in the line:
      • If character is { then increment B
      • If character is } then decrement B
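The steps above could be sketched in Python roughly as follows, with "comment out every line of the module block" as the action taken. The function name comment_out_modules, the in_module flag and the "# " prefix are assumptions made for illustration. The header regex is the one from the steps, except that the hyphen is moved to the end of the character class so that Python's re module reads it as a literal.

```python
import re

# Header pattern from the steps above, adapted for Python's re syntax.
MODULE_HEADER = re.compile(r'^module ["\w -]*\{')


def comment_out_modules(text, prefix="# "):
    """Prefix every line of each top-level module block with a comment marker."""
    depth = 0          # "B" in the steps above: current brace nesting depth
    in_module = False  # True while we are inside a module block
    out = []
    for line in text.splitlines():
        # Only look for a module header when we are outside every block.
        if depth == 0 and MODULE_HEADER.match(line):
            in_module = True
        out.append(prefix + line if in_module else line)
        # Naive brace counting: every { or } on the line changes the depth.
        for ch in line:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
        # Back at depth zero means the current block has closed.
        if depth == 0:
            in_module = False
    return "\n".join(out)
```

Applied to the sample in the question, this would prefix the six lines of the module block and leave the data and resource blocks alone, even though tag = { ... } appears in both. Note that splitlines() drops a trailing newline, so callers that need one have to add it back.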

This uses a naive brace-counting approach to approximate the boundaries of blocks. It will fail if the input contains a literal string (either quoted or heredoc) with unbalanced braces inside it, so you might try to improve on that by tracking open/close quote and heredoc marker pairs too.
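One way the quoted-string part of that improvement might look is a per-line brace counter that skips anything between double quotes. This is a hedged Python sketch, not a full tokenizer: count_braces is a hypothetical helper, it assumes strings never span lines, and it knows nothing about comments or heredocs.

```python
def count_braces(line):
    """Return the net brace depth change for one line, ignoring braces inside "..." strings."""
    delta = 0
    in_string = False
    escaped = False
    for ch in line:
        if in_string:
            if escaped:
                escaped = False      # the character after a backslash is taken literally
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote
        elif ch == '"':
            in_string = True         # opening quote
        elif ch == "{":
            delta += 1
        elif ch == "}":
            delta -= 1
    return delta
```

Adding the returned value to the running depth, instead of counting every brace character, would stop a line such as name = "a{b" from throwing the count off.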

Anything short of a full parser for the language will always have some edge cases it can't handle, but if you can constrain your input so that it never includes a situation your simpler ruleset can't understand, then an approach like the above might work for you.

If you were willing to write your program in Go, you'd be able to use the hclwrite package, which is part of the underlying library Terraform uses to implement its language syntax. It has a full parser and allows making "surgical" edits to what it reads, though at the time I write this it doesn't seem to have functions for adding comments to blocks in particular, so it's not currently ready to solve your specific goal here.

It might be useful to others who find this question in the future and have other goals related to modifying existing Terraform configurations, and it may gain additional functionality to support other use cases in the future.
