Chef provisioner in Terraform hangs when provisioning more than one resource


Problem description

When using Terraform to provision multiple machines, and the Terraform Chef provisioner to configure them, I am only able to get it to work if only one "resource" is being cheffed in the Terraform run. Everything works perfectly when only one VM is targeted. When more than one resource is provisioned, the chef run hangs at the "Creating configuration files..." step.

I have tried using modules, provisioning inside each resource, and most recently using null_resources to provision the VM resources after they've been created. (The null_resource has proven very useful, as it allows me to iterate on the chef run quickly without having to re-spin the VM resource every time, as I did when the provisioner was inside the resource block.)

This happened on TF 0.11, and continues in v0.12:

Terraform v0.12.8
+ provider.null v2.1.2
+ provider.vra7 v0.4.1



Provisioner inside the resource:

resource "vra7_deployment" "vra-vm" {
 ...
  resource_configuration = {
    "vSphere_Machine_1.name" = ""
    "vSphere_Machine_1.ip_address" = ""
    "vSphere_Machine_1.description" = "Terraform ICE SQL"
  }
  ...

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host = self.resource_configuration["vSphere_Machine_1.ip_address"]
      type = "winrm"
      user = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    # Note! the version constraint doesn't work
    server_url = var.chef_server_url
    node_name  = "ICE-SQL-${self.resource_configuration["vSphere_Machine_1.name"]}"
    run_list   = var.sql_run_list
    recreate_client = true
    environment = "_default"
    ssl_verify_mode = ":verify_none"
    version = "~> 12"
    user_name  = local.username
    user_key   = file("${local.user_key_path}")
  }
}



Provisioner using null_resource block:

resource "vra7_deployment" "ICE-SQL" {
  count = var.sql_count # will be 1/on or 0/off
  ...
  resource_configuration = {
    "vSphere_Machine_1.name" = ""
    "vSphere_Machine_1.ip_address" = ""
    "vSphere_Machine_1.description" = "Terraform ICE SQL"
  }
}

locals {
  sql_ip   = vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.ip_address"]
  sql_name = vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.name"]
}

resource "null_resource" "sql-chef" { 
  # we can use count to switch creating this on or off for testing
  count = 0

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host = local.sql_ip
      type = "winrm"
      user = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    # Don't use the local var here, so TF knows to create the dependency
    server_url = var.chef_server_url
    node_name  = "ICE-SQL-${vra7_deployment.ICE-SQL[0].resource_configuration["vSphere_Machine_1.name"]}"
    run_list   = var.sql_run_list
    recreate_client = true
    environment = "_default"
    ssl_verify_mode = ":verify_none"
    version = "12"
    user_name  = local.username
    user_key   = file("${local.user_key_path}")
    client_options = var.chef_client_options
  }
}
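
The null_resource approach above only re-runs its provisioner when the null_resource itself is recreated. A minimal sketch of one way to control that, assuming the standard `triggers` argument of `null_resource` and assuming `var.sql_run_list` is a list of strings: any change to a trigger value replaces the null_resource and so re-runs the chef provisioner, while the VM is left alone. The trigger keys (`vm_id`, `run_list`) are names chosen for illustration, and the provisioner body is elided because it is unchanged from the block above.

```hcl
resource "null_resource" "sql-chef" {
  # Assumption: re-provision whenever the VM is replaced or the run list changes.
  # Trigger values must be strings, so the list is joined into one string.
  triggers = {
    vm_id    = vra7_deployment.ICE-SQL[0].id
    run_list = join(",", var.sql_run_list)
  }

  provisioner "chef" {
    # connection block and Chef settings exactly as in the block above
  }
}
```

This is a fragment, not a complete configuration: the chef provisioner still needs its connection block and required settings filled in from the original.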



Modules:

### main.tf
module "SQL" {
  source   = "./modules/vra-chef"
  VRA_USER = var.VRA_USER
  VRA_PASS = var.VRA_PASS
  KT_USER  = var.KT_USER
  KT_PASS  = var.KT_PASS

  description = "ICE SQL"
  run_list    = var.sql_run_list
}

### modules/vra-chef/main.tf
resource "vra7_deployment" "vra-chef" {
  count = var.server_count
...
  resource_configuration = {
    "vSphere_Machine_1.name"       = var.resource_name
    "vSphere_Machine_1.ip_address"  = var.resource_ip
    "vSphere_Machine_1.description" = "${var.description}-${count.index}"
  }

  provisioner "chef" {
    # This is for TF to talk to the new node
    connection {
      host = self.resource_configuration["vSphere_Machine_1.ip_address"]
      type = "winrm"
      user = var.KT_USER
      password = var.KT_PASS
      insecure = true
    }

    # This is for TF to talk to the chef_server
    server_url = var.chef_server_url
    node_name  = self.resource_configuration["vSphere_Machine_1.name"]
    run_list   = var.run_list
    recreate_client = true
    environment = "_default"
    ssl_verify_mode = ":verify_none"
    version = "~> 12"
    user_name  = local.username
    user_key   = file(local.user_key_path)
    client_options = [ "chef_license  'accept'" ]

    # pass custom attributes to the new node
    attributes_json = var.input_json
  }
}



Expected Results:

Chef configures all resources that it is applied to.

Actual Results:

The Terraform Chef provisioner connects to all resources that it is applied to and installs chef on the clients. When it gets to the "Creating configuration files..." step, it stops sending any more updates, and the Terraform run keeps printing "Still creating..." for each resource every 10s.

vra7_deployment.ICE-REMOTE[0]: Still creating... [9m30s elapsed]
vra7_deployment.ICE-SQL[0]: Still creating... [9m30s elapsed]
vra7_deployment.ICE-MASTER[0]: Still creating... [9m30s elapsed]
vra7_deployment.ICE-MASTER[0]: Creation complete after 9m39s [id=feecf983-48d5-425e-b713-65a1a05fa3ba]
vra7_deployment.ICE-REMOTE[0]: Still creating... [9m40s elapsed]
vra7_deployment.ICE-SQL[0]: Still creating... [9m40s elapsed]
...
vra7_deployment.ICE-SQL[0]: Still creating... [12m10s elapsed]
vra7_deployment.ICE-REMOTE[0]: Still creating... [12m10s elapsed]
vra7_deployment.ICE-REMOTE[0]: Creation complete after 12m11s [id=df64f5ab-af12-4493-8e7d-d7debd93780d]
vra7_deployment.ICE-SQL[0]: Still creating... [12m20s elapsed]
...
vra7_deployment.ICE-SQL[0]: Still creating... [13m10s elapsed]
vra7_deployment.ICE-SQL[0]: Creation complete after 13m11s [id=08ec31f4-124d-470e-b2ba-1833a6f22792]
null_resource.sql-chef[0]: Creating...
null_resource.master-chef[0]: Creating...
null_resource.remote-chef[0]: Creating...
null_resource.sql-chef[0]: Provisioning with 'chef'...
null_resource.master-chef[0]: Provisioning with 'chef'...
null_resource.remote-chef[0]: Provisioning with 'chef'...
null_resource.master-chef[0] (chef): Connecting to remote host via WinRM...
null_resource.master-chef[0] (chef):   Host: 10.12.235.61
null_resource.master-chef[0] (chef):   Port: 5985
null_resource.master-chef[0] (chef):   User: engineering
null_resource.master-chef[0] (chef):   Password: true
null_resource.master-chef[0] (chef):   HTTPS: false
null_resource.master-chef[0] (chef):   Insecure: true
null_resource.master-chef[0] (chef):   NTLM: false
null_resource.master-chef[0] (chef):   CACert: false
null_resource.sql-chef[0] (chef): Connecting to remote host via WinRM...
null_resource.sql-chef[0] (chef):   Host: 10.12.235.50
null_resource.sql-chef[0] (chef):   Port: 5985
null_resource.sql-chef[0] (chef):   User: engineering
null_resource.sql-chef[0] (chef):   Password: true
null_resource.sql-chef[0] (chef):   HTTPS: false
null_resource.sql-chef[0] (chef):   Insecure: true
null_resource.sql-chef[0] (chef):   NTLM: false
null_resource.sql-chef[0] (chef):   CACert: false
null_resource.remote-chef[0] (chef): Connecting to remote host via WinRM...
null_resource.remote-chef[0] (chef):   Host: 10.12.233.51
null_resource.remote-chef[0] (chef):   Port: 5985
null_resource.remote-chef[0] (chef):   User: engineering
null_resource.remote-chef[0] (chef):   Password: true
null_resource.remote-chef[0] (chef):   HTTPS: false
null_resource.remote-chef[0] (chef):   Insecure: true
null_resource.remote-chef[0] (chef):   NTLM: false
null_resource.remote-chef[0] (chef):   CACert: false
null_resource.sql-chef[0] (chef): Connected!
null_resource.remote-chef[0] (chef): Connected!
null_resource.master-chef[0] (chef): Connected!
null_resource.remote-chef[0] (chef): Downloading Chef Client...
null_resource.sql-chef[0] (chef): Downloading Chef Client...
null_resource.remote-chef[0] (chef): Installing Chef Client...
null_resource.sql-chef[0] (chef): Installing Chef Client...
null_resource.remote-chef[0]: Still creating... [10s elapsed]
null_resource.master-chef[0]: Still creating... [10s elapsed]
null_resource.sql-chef[0]: Still creating... [10s elapsed]
null_resource.sql-chef[0] (chef): Creating configuration files...
null_resource.remote-chef[0] (chef): Creating configuration files...
null_resource.master-chef[0] (chef): Downloading Chef Client...
null_resource.master-chef[0] (chef): Installing Chef Client...
null_resource.master-chef[0] (chef): Creating configuration files...
null_resource.remote-chef[0]: Still creating... [20s elapsed]
null_resource.master-chef[0]: Still creating... [20s elapsed]
null_resource.sql-chef[0]: Still creating... [20s elapsed]
null_resource.remote-chef[0]: Still creating... [30s elapsed]
null_resource.sql-chef[0]: Still creating... [30s elapsed]
null_resource.master-chef[0]: Still creating... [30s elapsed]
null_resource.remote-chef[0]: Still creating... [40s elapsed]
null_resource.sql-chef[0]: Still creating... [40s elapsed]
null_resource.master-chef[0]: Still creating... [40s elapsed]
null_resource.remote-chef[0]: Still creating... [50s elapsed]
null_resource.sql-chef[0]: Still creating... [50s elapsed]
null_resource.master-chef[0]: Still creating... [50s elapsed]
null_resource.remote-chef[0]: Still creating... [1m0s elapsed]
null_resource.sql-chef[0]: Still creating... [1m0s elapsed]
null_resource.master-chef[0]: Still creating... [1m0s elapsed]
...loops waiting forever...



Other context:

I've logged this at Terraform's GitHub, with no response. My comments from there:

What I've found is that it seems to not like chef-provisioning more than one machine at a time. So far I've found cases where 1 out of 4 machines will provision perfectly, and the others just hang after they all print the "Creating configuration files..." status. Leaving the first one active, on the next run, the other three will all hang again at the same place. Finally, I tweaked the code to only re-provision one of the machines, and it worked perfectly. To be clear: the exact same code that hangs on a prior run will execute perfectly when run by itself. I think that's a critical clue to debugging this.

To reiterate: when it gets stuck, the chef provisioning always hangs at the "Creating configuration files..." step. If it gets past that, it always works.

Here is a gist of a chef run using null_resource on two resources, both of which hang: https://gist.github.com/mcascone/0b71948f50d52648389e661d00c8e31c

And this is one of a successful, 1-resource run: https://gist.github.com/mcascone/858855b5bd9d5d1cf655d5e10df67801

I keep thinking this is an issue with the same provisioner being called multiple times in the same main.tf file. I'm calling the chef provisioner 3+ times in one apply run. Could it be that the multiple instances of the provisioner are colliding with each other, or that there isn't actually support for multiple runs of the same provisioner, and they're all getting instantiated in the same instance and corrupting each other?
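
One way to test the collision theory is to force the chef runs to happen one after another instead of in parallel. The sketch below is a hypothetical experiment, not a confirmed fix: it chains the three null_resources named in the log output with `depends_on`, so Terraform finishes one before starting the next. The provisioner bodies are elided because they would be unchanged from the blocks in the question.

```hcl
resource "null_resource" "sql-chef" {
  provisioner "chef" {
    # connection block and Chef settings as in the question
  }
}

resource "null_resource" "master-chef" {
  # Assumption for the experiment: wait until the SQL node is fully cheffed.
  depends_on = [null_resource.sql-chef]

  provisioner "chef" {
    # connection block and Chef settings as in the question
  }
}

resource "null_resource" "remote-chef" {
  # Runs last, so at most one chef provisioner is active at any time.
  depends_on = [null_resource.master-chef]

  provisioner "chef" {
    # connection block and Chef settings as in the question
  }
}
```

If the serialized run completes where the parallel run hangs, that would point at concurrent provisioner instances rather than at the Chef configuration itself.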

Recommended answer

It looks like, for now at least, we have to downgrade to v0.11 to get multiple provision runs to work. Please see this thread: "Terraform stucks when instance_count is more than 2 while using remote-exec provisioner".
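
If you try the downgrade, a minimal sketch of pinning the configuration to the 0.11 series looks like the following. It assumes the rest of the configuration is rewritten in 0.11 syntax, where expressions must be wrapped in `"${...}"`. Since the question reports the hang on 0.11 as well, treat this as something to try rather than a guaranteed fix.

```hcl
terraform {
  # Accept any 0.11.x release and refuse to run under 0.12 or later.
  required_version = "~> 0.11.0"
}

resource "null_resource" "sql-chef" {
  provisioner "chef" {
    # Terraform 0.11 needs interpolation syntax around every expression.
    server_url = "${var.chef_server_url}"
    user_name  = "${local.username}"
    user_key   = "${file(local.user_key_path)}"
    # remaining connection and Chef settings from the question,
    # converted to the same "${...}" form
  }
}
```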
