Terraform gets stuck when instance_count is more than 2 while using the remote-exec provisioner

Problem description

  • I am trying to provision multiple Windows EC2 instances with Terraform's remote-exec provisioner, using null_resource.

$ terraform -v
Terraform v0.12.6
provider.aws v2.23.0
provider.null v2.1.2

  • Originally, I was working with three remote-exec provisioners (two of them involved rebooting the instance), without null_resource and for a single instance. Everything worked fine.
  • I then needed to increase the count and, based on several links, ended up using null_resource. I have now reduced the issue to the point where I cannot run even one remote-exec provisioner for more than 2 Windows EC2 instances using null_resource.

Terraform template to reproduce the issue:

//VARIABLES

variable "aws_access_key" {
  default = "AK"
}
variable "aws_secret_key" {
  default = "SAK"
}
variable "instance_count" {
  default = "3"
}
variable "username" {
  default = "Administrator"
}
variable "admin_password" {
  default = "Password"
}
variable "instance_name" {
  default = "Testing"
}
variable "vpc_id" {
  default = "vpc-id"
}

//PROVIDERS
provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "ap-southeast-2"
}

//RESOURCES
resource "aws_instance" "ec2instance" {
  count         = "${var.instance_count}"
  ami           = "Windows AMI"
  instance_type = "t2.xlarge"
  key_name      = "ec2_key"
  subnet_id     = "subnet-id"
  vpc_security_group_ids = ["${aws_security_group.ec2instance-sg.id}"]
  tags = {
    Name = "${var.instance_name}-${count.index}"
  }
}

resource "null_resource" "nullresource" {
  count = "${var.instance_count}"
  connection {
    type     = "winrm"
    host     = "${element(aws_instance.ec2instance.*.private_ip, count.index)}"
    user     = "${var.username}"
    password = "${var.admin_password}"
    timeout  = "10m"
  }
  provisioner "remote-exec" {
    inline = [
      "powershell.exe Write-Host Instance_No=${count.index}"
    ]
  }
//   provisioner "local-exec" {
//     command = "powershell.exe Write-Host Instance_No=${count.index}"
//   }
//   provisioner "file" {
//       source      = "testscript"
//       destination = "D:/testscript"
//   }
}
resource "aws_security_group" "ec2instance-sg" {
  name        = "${var.instance_name}-sg"
  vpc_id      = "${var.vpc_id}"


//   RDP
  ingress {
    from_port   = 3389
    to_port     = 3389
    protocol    = "tcp"
    cidr_blocks = ["CIDR"]
  }

//   WinRM access from the machine running TF to the instance
  ingress {
    from_port   = 5985
    to_port     = 5985
    protocol    = "tcp"
    cidr_blocks = ["CIDR"]
  }

  tags = {
    Name        = "${var.instance_name}-sg"
  }

}
//OUTPUTS
output "private_ip" {
  value = "${aws_instance.ec2instance.*.private_ip}"
}

Observations:

  • With one remote-exec provisioner, it works fine if count is set to 1 or 2. With count set to 3, it is unpredictable whether the provisioner will run on all the instances every time. What is certain is that Terraform never completes and never shows the output variables. It keeps showing "null_resource.nullresource[count.index]: Still creating..."
  • With the local-exec provisioner, everything works fine. Tested with count values of 1, 2 and 7.
  • With the file provisioner, it works fine for 1, 2 and 3. With 7 it does not finish, although the file was copied to all 7 instances. It keeps showing "null_resource.nullresource[count.index]: Still creating..."
  • In every attempt, the remote-exec provisioner is able to connect to the instances irrespective of the count value. It just does not trigger the inline command on some of them, seemingly at random, and then starts showing the "Still creating..." message.
  • I have been stuck on this issue for quite some time and could not find anything significant in the debug logs either. I know Terraform is not recommended as a configuration management tool. However, everything works fine, even with complex provisioning scripts, if the instance count is just 1 (even without null_resource), which suggests Terraform should be able to handle such a basic provisioning requirement.
  • TF_DEBUG logs:
  • count=2: TF completes successfully and shows "Apply complete!".
  • count=3: TF runs the remote-exec on all three instances but does not complete and does not show the output variables. Stuck at "Still creating..."
  • count=3: TF runs the remote-exec on only two instances and skips nullresource[1]. It does not complete and does not show the output variables. Stuck at "Still creating..."
  • Any pointers will be greatly appreciated!

Recommended answer

Update: what eventually did the trick was downgrading Terraform to v0.11.14, as per this issue comment: github.com/hashicorp/terraform/issues/22006#issuecomment-509588621
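If you go the downgrade route, it may help to pin the version in the configuration so that a newer binary is not used by accident. The fragment below is only a sketch: it assumes the "v11.14" mentioned in the update means Terraform 0.11.14, and the exact-match constraint is one possible choice, not something the answer prescribes.

```hcl
terraform {
  # Assumption: the version named in the update is Terraform 0.11.14.
  # An exact constraint makes any other Terraform version refuse to run.
  required_version = "= 0.11.14"
}
```

With this in place, running the template with a different Terraform version should stop with a version constraint error instead of proceeding.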

A few things you can try:

  1. Inline remote-exec:

resource "aws_instance" "ec2instance" {
  count         = "${var.instance_count}"
  # ...
  provisioner "remote-exec" {
    connection {
      # ...
    }
    inline = [
      # ...
    ]
  }
}

Now you can refer to self inside the connection block to get the instance's private IP.
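As a rough illustration of the inline approach, the elided parts might be filled in as follows. This is a sketch only: it reuses the WinRM settings, variables and placeholder values from the question's template, omits the subnet and security group arguments for brevity, and has not been verified against the hang described above.

```hcl
resource "aws_instance" "ec2instance" {
  count         = "${var.instance_count}"
  ami           = "Windows AMI" # placeholder, as in the question
  instance_type = "t2.xlarge"
  key_name      = "ec2_key"

  provisioner "remote-exec" {
    connection {
      type     = "winrm"
      host     = "${self.private_ip}" # self is the instance being created
      user     = "${var.username}"
      password = "${var.admin_password}"
      timeout  = "10m"
    }

    inline = [
      "powershell.exe Write-Host Instance_No=${count.index}",
    ]
  }
}
```

Because the provisioner lives inside the aws_instance resource, each instance provisions itself and no element() lookup into the instance list is needed.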

  2. Add triggers to null_resource:

resource "null_resource" "nullresource" {
  triggers = { # map argument; Terraform 0.12 requires the "=" form
    host    = "${element(aws_instance.ec2instance.*.private_ip, count.index)}" # Rerun when IP changes
    version = "${timestamp()}" # ...or rerun every time
  }
  # ...
}

You can use the triggers attribute to recreate the null_resource and thus re-execute remote-exec.
