Using cloud-init on an Azure VM to mount a data disk fails
Question

This is a similar problem to a previous SO question, from which I adapted my code: "How can i use cloud-init to load a datadisk on an ubuntu VM in azure"
Using a cloud-config file passed through Terraform:
#cloud-config
disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: true
    overwrite: false

fs_setup:
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
  - [
      "/dev/disk/azure/scsi1/lun0-part1",
      "/opt/data",
      auto,
      "defaults,noexec,nofail",
    ]
data "template_file" "cloudconfig" {
  template = file("${path.module}/cloud-init.tpl")
}

data "template_cloudinit_config" "config" {
  gzip          = true
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content      = "${data.template_file.cloudconfig.rendered}"
  }
}
module "nexus_test_vm" {
  # unnecessary details omitted - 1 VM with 1 external disk, fixed lun of 0, Ubuntu 18.04
  vm_size             = "Standard_B2S"
  cloud_init_template = data.template_cloudinit_config.config.rendered
}
Relevant bit of the module (VM creation):
resource "azurerm_virtual_machine" "generic-vm" {
  count                         = var.number
  name                          = "${local.my_name}-${count.index}-vm"
  location                      = var.location
  resource_group_name           = var.resource_group_name
  network_interface_ids         = [azurerm_network_interface.generic-nic[count.index].id]
  vm_size                       = var.vm_size
  delete_os_disk_on_termination = true

  storage_image_reference {
    id = var.image_id
  }

  storage_os_disk {
    name              = "${local.my_name}-${count.index}-os"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
    disk_size_gb      = var.os_disk_size
  }

  os_profile {
    computer_name  = "${local.my_name}-${count.index}"
    admin_username = local.my_admin_user_name
    custom_data    = var.cloud_init_template
  }

  os_profile_linux_config {
    disable_password_authentication = true

    ssh_keys {
      path = "/home/${local.my_admin_user_name}/.ssh/authorized_keys"
      //key_data = tls_private_key.vm_ssh_key.public_key_openssh
      key_data = var.public_key_openssh
    }
  }

  tags = {
    Name        = "${local.my_name}-${count.index}"
    Deployment  = local.my_deployment
    Prefix      = var.prefix
    Environment = var.env
    Location    = var.location
    Volatile    = var.volatile
    Terraform   = "true"
  }
}

resource "azurerm_managed_disk" "generic-disk" {
  name                 = "${azurerm_virtual_machine.generic-vm.*.name[0]}-1-generic-disk"
  location             = var.rg_location
  resource_group_name  = var.rg_name
  storage_account_type = "Standard_LRS"
  create_option        = "Empty"
  disk_size_gb         = var.external_disk_size
}

resource "azurerm_virtual_machine_data_disk_attachment" "generic-disk" {
  managed_disk_id    = azurerm_managed_disk.generic-disk.id
  virtual_machine_id = azurerm_virtual_machine.generic-vm.*.id[0]
  lun                = 0
  caching            = "ReadWrite"
}
I am getting a lot of weird errors indicating that the disk does not exist while cloud-init is running. However, when I SSH into the VM, the disk is right there! Is this a race condition? Is there a wait I can configure in cloud-init, or something that would give me a better picture of what might be happening?
Relevant logs from the VM:
2020-04-07 16:30:51,296 - cc_disk_setup.py[DEBUG]: Partitioning disks: {'/dev/disk/azure/scsi1/lun0': {'layout': True, 'overwrite': False, 'table_type': 'gpt'}, '/dev/disk/cloud/azure_resource': {'table_type': 'gpt', 'layout': [100], 'overwrite': True, '_origname': 'ephemeral0'}}
2020-04-07 16:30:51,318 - util.py[DEBUG]: Creating partition on /dev/disk/azure/scsi1/lun0 took 0.021 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,601 - cc_disk_setup.py[DEBUG]: setting up filesystems: [{'device': '/dev/disk/azure/scsi1/lun0', 'filesystem': 'ext4', 'partition': 1}]
2020-04-07 16:30:51,725 - util.py[DEBUG]: Creating fs for /dev/disk/azure/scsi1/lun0 took 0.124 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,733 - cc_mounts.py[DEBUG]: mounts configuration is [['/dev/disk/azure/scsi1/lun0-part1', '/opt/data', 'auto', 'defaults,noexec,nofail']]
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Attempting to determine the real name of /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: changed /dev/disk/azure/scsi1/lun0-part1 => None
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Ignoring nonexistent named mount /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,736 - cc_mounts.py[DEBUG]: Changes to fstab: ['+ /dev/disk/azure/scsi1/lun0-part1 /opt/data auto defaults,noexec,nofail,comment=cloudconfig 0 2']
ls -l /dev/disk/azure/scsi1/lun0
lrwxrwxrwx 1 root root 12 Apr 7 16:32 /dev/disk/azure/scsi1/lun0 -> ../../../sdc
Answer
I think this issue comes down to the ordering of the data disk, the VM, and cloud-init. cloud-init executes on the VM's first boot, and in the Terraform shown, the data disk is created and attached after the VM itself, so it appears later than cloud-init runs, and that causes the error.
So the solution is to define the data disk inside the VM resource using a storage_data_disk block, so that the VM is created with the data disk already attached before cloud-init executes.