"2015_12_06_Visé_135942" by "Norbert Schnitzler" on Flickr

Idea for Reusable “Custom Data” templates across multiple modules with Terraform

A few posts ago I wrote about building Windows virtual machines with Terraform, and a couple of days ago, “YoureInHell” on Twitter reached out and asked what advice I’d give about having several different terraform modules use the same basic build of custom data.

They’re trying to avoid putting the same template file into several repos (I suspect so that one team can manage the “custom-data”, “user-data” or “cloud-init” files, and another can manage the deployment terraform files), and asked if I had any suggestions.

I had three ideas.

Using a New Module

This was my initial thought; create a new module called something like “Standard Build File”, and this build file contains just the following terraform file, and a template file called “build.tmpl”.

variable "someKey" {
  default = "someVar"
}

variable "hostName" {
  default = "hostName"
}

variable "unsetVar" {}

output "template" {
  value = templatefile("build.tmpl",
    {
      someKey  = var.someKey
      hostName = var.hostName
      unsetVar = var.unsetVar
    }
  )
}

Now, in your calling module, you can do:

module "buildTemplate" {
  source   = "git::https://git.example.net/buildTemplate.git?ref=latestLive"
  # See https://www.terraform.io/docs/language/modules/sources.html
  #   for more details on how to specify the source of this module
  unsetVar = "Set To This String"
}

output "RenderedTemplate" {
  value = module.buildTemplate.template
}

And that means that you can use the module.buildTemplate.template anywhere you’d normally specify your templateFile, and get a consistent, yet customizable template (and note, because I specified a particular tag, you can use that to move to the “current latest” or “the version we released into live on YYYY-MM-DD” by using a tag, or a commit ref.)

Now, the downside to this is that you’ve now got a whole separate module for creating your instances that needs to be maintained. What are our other options?

Git Submodules for your template

I use Git Submodules a LOT for my code. It’s a bit easy to get into a state with them, particularly if you’re not great at keeping on top of them, but… if you are OK with them, you’d create a repo, again, let’s use “https://git.example.net/buildTemplate.git” as our git repo, and put your template in there. In your terraform git repo, you’d run this command: git submodule add https://git.example.net/buildTemplate.git and this would add a directory to your repo called “buildTemplate” that you can use your templatefile function in Terraform against (like this: templatefile("buildTemplate/build.tmpl", {someVar="var"})).

Now, this means that you’ve effectively got two git repos in one tree, and if any changes occur in your submodule repo, you’d need to do git checkout main ; git pull to get the latest updates from your main branch, and when you check it out initially on another machine, you’ll need to do git clone https://git.example.net/terraform --recurse-submodules to get the submodules populated at the same time.

A benefit to this is that because it’s “inline” with the rest of your tree, if you need to make any changes to this template, it’s clearly where it’s supposed to be in your tree, you just need to remember about the submodule when it comes to making PRs and suchforth.

How about that third idea?

Keep it simple, stupid 😁

Why bother with submodules, or modules from a git repo? Terraform can be quite easy to over complicate… so why not create all your terraform files in something like this structure:

project\build.tmpl
project\web_servers\main.tf
project\logic_servers\main.tf
project\database_servers\main.tf

And then in each of your terraform files (web_servers, logic_servers and database_servers) just reference the file in your project root, like this: templatefile("../build.tmpl", {someVar="var"})

The downside to this is that you can’t as easily farm off the control of that build script to another team, and they’d be making (change|pull|merge) requests against the same repo as you… but then again, isn’t that the idea for functional teams? 😃

Featured image is “2015_12_06_Visé_135942” by “Norbert Schnitzler” on Flickr and is released under a CC-BY-SA license.

"Field Notes - Sweet Tooth" by "The Marmot" on Flickr

Multi-OS builds in AWS with Terraform – some notes from the field!

Late edit: 2020-05-22 – Updated with better search criteria from colleague conversations

I’m building a proof of concept for … well, a product that needs testing on several different Linux and Windows variants on AWS and Azure. I’m building this environment with Terraform, and it’s thrown me a few curve balls, so I thought I’d document the issues I’ve had!

The versions of distributions I have tested are the latest releases of each of these images at-or-near the time of writing. The major version listed is the earliest I have tested, so no assumption is made about previous versions, and later versions, after the time of this post should not assume any of this data is also accurate!

(Fujitsu Staff – please contact me on my work email address for details on how to get the internal AMIs of our builds of these images 😄)

Linux Distributions

On the whole, I tend to be much more confident and knowledgable about Linux distributions. I’ve also done far more installs of each of these!

Almost all of these installs are Free of Charge, with the exception of Red Hat Enterprise Linux, which requires a subscription fee, and this can be “Pay As You Go” or “Bring Your Own License”. These sorts of things are arranged for me, so I don’t know how easy or hard it is to organise these licenses!

These builds all use cloud-init, via either a cloud-init yaml script, or some shell scripting language (usually accepted to be bash). If this script fails to execute, you will find your user-data file in /var/lib/cloud/instance/scripts/part-001. If this is a shell script then you will be able to execute it by running that script as your root user.

Amazon Linux 2 or Amzn2

Amazon Linux2 is the “preferred” distribution for Amazon Web Services (AWS) (surprisingly enough). It is based on Red Hat Enterprise Linux (RHEL), and many of the instructions you’ll want to run to install software will use RHEL based instructions. This platform is not available outside the AWS ecosystem, as far as I can tell, although you might be able to run it on-prem.

Software packages are limited in this distribution, so any “extra” features require the installation of the “EPEL” repository, by executing the command sudo amazon-linux-extras install epel and then using the yum command to install further packages. I needed nginx for part of my build, and this was only in EPEL.

Amzn2 AMI Lookup

data "aws_ami" "amzn2" {
  most_recent = true

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-2.0.*-gp2"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["amazon"] # Canonical
}

Amzn2 User Account

Amazon Linux 2 images under AWS have a default “ec2-user” user account. sudo will allow escalation to Root with no password prompt.

Amzn2 AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed. To manage the interface, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 and apply changes with ifdown eth0 ; ifup eth0.

Amzn2 user-data / Cloud-Init Troubleshooting

I’ve found the output from user-data scripts appearing in /var/log/cloud-init-output.log.

CentOS 7

For starters, AWS doesn’t have an official CentOS8 image, so I’m a bit stymied there! In fact, as far as I can make out, CentOS is only releasing ISOs for builds now, and not any cloud images. There’s an open issue on their bug tracker which seems to suggest that it’s not going to get any priority any time soon! Blimey.

This image may require you to “subscribe” to the image (particularly if you have a “private marketplace”), but this will be requested of you (via a URL provided on screen) when you provision your first machine with this AMI.

Like with Amzn2, CentOS7 does not have nginx installed, and like Amzn2, installation of the EPEL library is not a difficult task. CentOS7 bundles a file to install the EPEL, installed by running yum install epel-release. After this is installed, you have the “full” range of software in EPEL available to you.

CentOS AMI Lookup

data "aws_ami" "centos7" {
  most_recent = true

  filter {
    name   = "name"
    values = ["CentOS Linux 7*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["aws-marketplace"]
}

CentOS User Account

CentOS7 images under AWS have a default “centos” user account. sudo will allow escalation to Root with no password prompt.

CentOS AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed. To manage the interface, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 and apply changes with ifdown eth0 ; ifup eth0.

CentOS Cloud-Init Troubleshooting

I’ve run several different user-data located bash scripts against this system, and the logs from these scripts are appearing in the default syslog file (/var/log/syslog) or by running journalctl -xefu cloud-init. They do not appear in /var/log/cloud-init-output.log.

Red Hat Enterprise Linux (RHEL) 7 and 8

Red Hat has both RHEL7 and RHEL8 images in the AWS market place. The Proof Of Value (POV) I was building was only looking at RHEL7, so I didn’t extensively test RHEL8.

Like Amzn2 and CentOS7, RHEL7 needs EPEL installing to have additional packages installed. Unlike Amzn2 and CentOS7, you need to obtain the EPEL package from the Fedora Project. Do this by executing these two commands:

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install epel-release-latest-7.noarch.rpm

After this is installed, you’ll have access to the broader range of software that you’re likely to require. Again, I needed nginx, and this was not available to me with the stock install.

RHEL7 AMI Lookup

data "aws_ami" "rhel7" {
  most_recent = true

  filter {
    name   = "name"
    values = ["RHEL-7*GA*Hourly*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["309956199498"] # Red Hat
}

RHEL8 AMI Lookup

data "aws_ami" "rhel8" {
  most_recent = true

  filter {
    name   = "name"
    values = ["RHEL-8*HVM-*Hourly*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["309956199498"] # Red Hat
}

RHEL User Accounts

RHEL7 and RHEL8 images under AWS have a default “ec2-user” user account. sudo will allow escalation to Root with no password prompt.

RHEL AWS Interface Configuration

The primary interface is called eth0. Network Manager is installed, and the eth0 interface has a profile called “System eth0” associated to it.

RHEL Cloud-Init Troubleshooting

In RHEL7, as per CentOS7, logs from user-data scripts are appear in the general syslog file (in this case, /var/log/messages) or by running journalctl -xefu cloud-init. They do not appear in /var/log/cloud-init-output.log.

In RHEL8, logs from user-data scrips now appear in /var/log/cloud-init-output.log.

Ubuntu 18.04

At the time of writing this, the vendor, who’s product I was testing, categorically stated that the newest Ubuntu LTS, Ubuntu 20.04 (Focal Fossa) would not be supported until some time after our testing was complete. As such, I spent no time at all researching or planning to use this image.

Ubuntu is the only non-RPM based distribution in this test, instead being based on the Debian project’s DEB packages. As such, it’s range of packages is much wider. That said, for the project I was working on, I required a later version of nginx than was available in the Ubuntu Repositories, so I had to use the nginx Personal Package Archive (PPA). To do this, I found the official PPA for the nginx project, and followed the instructions there. Generally speaking, this would potentially risk any support from the distribution vendor, as it’s not certified or supported by the project… but I needed that version, so I had to do it!

Ubuntu 18.04 AMI Lookup

data "aws_ami" "ubuntu1804" {
  most_recent = true

  filter {
    name   = "name"
    values = ["*ubuntu*18.04*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["099720109477"] # Canonical
}

Ubuntu 18.04 User Accounts

Ubuntu 18.04 images under AWS have a default “ubuntu” user account. sudo will allow escalation to Root with no password prompt.

Ubuntu 18.04 AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed, and instead Ubuntu uses Netplan to manage interfaces. The file to manage the interface defaults is /etc/netplan/50-cloud-init.yaml. If you struggle with this method, you may wish to install ifupdown and define your configuration in /etc/network/interfaces.

Ubuntu 18.04 Cloud-Init Troubleshooting

In Ubuntu 18.04, logs from user-data scrips appear in /var/log/cloud-init-output.log.

Windows

This section is far more likely to have it’s data consolidated here!

Windows has a common “standard” username – Administrator, and a common way of creating a password (this is generated on-boot, and the password is transferred to the AWS Metadata Service, which it is retrieved and decrypted with the SSH key you’ve used to build the “authentication” to the box) which Terraform handles quite nicely.

The network device is referred to as “AWS PV Network Device #0”. It can be managed with powershell, netsh (although apparently Microsoft are rumbling about demising this script), or from the GUI.

Windows 2012R2

This version is very old now, and should be compared to Windows 7 in terms of age. It is only supported by Microsoft with an extended maintenance package!

Windows 2012R2 AMI Lookup

data "aws_ami" "w2012r2" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2012-R2_RTM-English-64Bit-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2012R2 Cloud-Init Troubleshooting

Logs from the Metadata Service can be found in C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt. You can also find the userdata script in C:\Program Files\Amazon\Ec2ConfigService\Scripts\UserScript.ps1. This can be launched and debugged using PowerShell ISE, which is in the “Start” menu.

Windows 2016

This version is reasonably old now, and should be compared to Windows 8 in terms of age. It is supported until 2022 in “mainline” support.

Windows 2016 AMI Lookup

data "aws_ami" "w2016" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2016-English-Full-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2016 Cloud-Init Troubleshooting

The metadata service has moved from Windows 2016 and onwards. Logs are stored in a partially hidden directory tree, so you may need to click in the “Address” bar of the Explorer window and type in part of this path. The path to these files is: C:\ProgramData\Amazon\EC2-Windows\Launch\Log. I say “files” as there are two parts to this file – an “Ec2Launch.log” file which reports on the boot process, and “UserdataExecution.log” which shows the output from the userdata script.

Unlike with the Windows 2012R2 version, you can’t get hold of the actual userdata script on the filesystem, you need to browse to a special path in the metadata service (actually, technically, you can do this with any of the metadata services – OpenStack, Azure, and so on) which is: http://169.254.169.254/latest/user-data/

This will contain userdata between a <powershell> and </powershell> pair of tags. This would need to be copied out of this URL and pasted into a new file on your local machine to determine why issues are occurring. Again, I would recommend using PowerShell ISE from the Start Menu to debug your code.

Windows 2019

This version is the most recent released version of Windows Server, and should be compared to Windows 10 in terms of age.

Windows 2019 AMI Lookup

data "aws_ami" "w2019" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2019-English-Full-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2019 Cloud-Init Troubleshooting

Functionally, the same as Windows 2016, but to recap, the metadata service has moved from Windows 2016 and onwards. Logs are stored in a partially hidden directory tree, so you may need to click in the “Address” bar of the Explorer window and type in part of this path. The path to these files is: C:\ProgramData\Amazon\EC2-Windows\Launch\Log. I say “files” as there are two parts to this file – an “Ec2Launch.log” file which reports on the boot process, and “UserdataExecution.log” which shows the output from the userdata script.

Unlike with the Windows 2012R2 version, you can’t get hold of the actual userdata script on the filesystem, you need to browse to a special path in the metadata service (actually, technically, you can do this with any of the metadata services – OpenStack, Azure, and so on) which is: http://169.254.169.254/latest/user-data/

This will contain userdata between a <powershell> and </powershell> pair of tags. This would need to be copied out of this URL and pasted into a new file on your local machine to determine why issues are occurring. Again, I would recommend using PowerShell ISE from the Start Menu to debug your code.

Featured image is “Field Notes – Sweet Tooth” by “The Marmot” on Flickr and is released under a CC-BY license.

“New shoes” by “Morgaine” from Flickr

Making Windows Cloud-Init Scripts run after a reboot (Using Terraform)

I’m currently building a Proof Of Value (POV) environment for a product, and one of the things I needed in my environment was an Active Directory domain.

To do this in AWS, I had to do the following steps:

  1. Build my Domain Controller
    1. Install Windows
    2. Set the hostname (Reboot)
    3. Promote the machine to being a Domain Controller (Reboot)
    4. Create a domain user
  2. Build my Member Server
    1. Install Windows
    2. Set the hostname (Reboot)
    3. Set the DNS client to point to the Domain Controller
    4. Join the server to the domain (Reboot)

To make this work, I had to find a way to trigger build steps after each reboot. I was working with Windows 2012R2, Windows 2016 and Windows 2019, so the solution had to be cross-version. Fortunately I found this script online! That version was great for Windows 2012R2, but didn’t cover Windows 2016 or later… So let’s break down what I’ve done!

In your userdata field, you need to have two sets of XML strings, as follows:

<persist>true</persist>
<powershell>
$some = "powershell code"
</powershell>

The first block says to Windows 2016+ “keep trying to run this script on each boot” (note that you need to stop it from doing non-relevant stuff on each boot – we’ll get to that in a second!), and the second bit is the PowerShell commands you want it to run. The rest of this now will focus just on the PowerShell block.

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {
      0 {
        $runCount = 1 + [int]$runCount
        Set-ItemProperty -Path $Path -Name RunCount -Value $runCount
        if ($ver -match 2012) {
          #Enable user data
          $EC2SettingsFile = "$env:ProgramFiles\Amazon\Ec2ConfigService\Settings\Config.xml"
          $xml = [xml](Get-Content $EC2SettingsFile)
          $xmlElement = $xml.get_DocumentElement()
          $xmlElementToModify = $xmlElement.Plugins
          
          foreach ($element in $xmlElementToModify.Plugin)
          {
            if ($element.name -eq "Ec2HandleUserData") {
              $element.State="Enabled"
            }
          }
          $xml.Save($EC2SettingsFile)
        }
        $some = "PowerShell Script"
      }
    }
  }

Whew, what a block! Well, again, we can split this up into a couple of bits.

In the first few lines, we build a pointer, a note which says “We got up to here on our previous boots”. We then read that into a variable and find that number and execute any steps in the block with that number. That’s this block:

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {

    }
  }

The next part (and you’ll repeat it for each “number” of reboot steps you need to perform) says “increment the number” then “If this is Windows 2012, remind the userdata handler that the script needs to be run again next boot”. That’s this block:

      0 {
        $runCount = 1 + [int]$runCount
        Set-ItemProperty -Path $Path -Name RunCount -Value $runCount
        if ($ver -match 2012) {
          #Enable user data
          $EC2SettingsFile = "$env:ProgramFiles\Amazon\Ec2ConfigService\Settings\Config.xml"
          $xml = [xml](Get-Content $EC2SettingsFile)
          $xmlElement = $xml.get_DocumentElement()
          $xmlElementToModify = $xmlElement.Plugins
          
          foreach ($element in $xmlElementToModify.Plugin)
          {
            if ($element.name -eq "Ec2HandleUserData") {
              $element.State="Enabled"
            }
          }
          $xml.Save($EC2SettingsFile)
        }
        
      }

In fact, it’s fair to say that in my userdata script, this looks like this:

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {
      0 {
        ${file("templates/step.tmpl")}

        ${templatefile(
          "templates/rename_windows.tmpl",
          {
            hostname = "SomeMachine"
          }
        )}
      }
      1 {
        ${file("templates/step.tmpl")}

        ${templatefile(
          "templates/join_ad.tmpl",
          {
            dns_ipv4 = "192.0.2.1",
            domain_suffix = "ad.mycorp",
            join_account = "ad\someuser",
            join_password = "SomePassw0rd!"
          }
        )}
      }
    }
  }

Then, after each reboot, you need a new block. I have a block to change the computer name, a block to join the machine to the domain, and a block to install an software that I need.

Featured image is “New shoes” by “Morgaine” on Flickr and is released under a CC-BY-SA license.

"vieux port Marseille" by "Jeanne Menjoulet" on Flickr

Networking tricks with Multipass in Virtualbox on Windows (Bridged interfaces and Port Forwards)

TL;DR? Want to “just” bridge one or more interfaces to a Multipass instance when you’re using Virtualbox? See the Bridging Summary below. Want to do a port forward? See the Port Forward section below. You will need the psexec command and to execute this as an administrator. The use of these two may be considered a security incident on your computing environment, depending on how your security processes and infrastructure are defined and configured.

Ah Multipass. This is a tool created by Canonical to create a “A mini-cloud on your Mac or Windows workstation.” (from their website)…

I’ve often seen this endorsed as the tool of choice from Canonical employees to do “stuff” like run Kubernetes, develop tools for UBPorts (previously Ubuntu Touch) devices, and so on.

So far, it seems interesting. It’s a little bit like Vagrant with an in-built cloud-init Provisioner, and as I want to test out the cloud-init files I’m creating for AWS and Azure, that’d be so much easier than actually building the AWS or Azure machines, or finding a viable cloud-init plugin for Vagrant to test it out.

BUT… Multipass is really designed for Linux systems (running LibVirt), OS X (running HyperKit) and Windows (running Hyper-V). Even if I were using Windows 10 Pro on this machine, I use Virtualbox for “things” on my Windows Machine, and Hyper-V steals the VT-X bit, which means that VirtualBox can’t run x64 code…. Soooo I can’t use the Hyper-V mode.

Now, there is a “fix” for this. You can put Multipass into Virtualbox mode, which lets you run Multipass on Windows or OS X without using their designed-for hypervisor, but this has a downside, you see, VirtualBox doesn’t give MultiPass the same interface to route networking connections to the VM, and there’s currently no CLI or GUI options to say “bridge my network” or “forward a port” (in part because it needs to be portable to the native hypervisor options, apparently). So, I needed to fudge some things so I can get my beloved bridged connections.

I got to the point where I could do this, thanks to the responses to a few issues I raised on the Multipass Github issues, mostly #1333.

The first thing you need to install in Windows is PsExec, because Multipass runs it’s Virtual Machines as the SYSTEM account, and talking to SYSTEM account processes is nominally hard. Get PsExec from the SysInternals website. Some IT Security professionals will note the addition of PsExec as a potential security incident, but then again, they might also see the running of a virtual machine as a security incident too, as these aren’t controlled with a central image. Anyway… Just bear it in mind, and don’t shout at me if you get frogmarched in front of your CISO.

I’m guessing if you’re here, you’ve already installed Multipass, (but if not, and it seems interesting – it’s over at https://multipass.run. Get it and install it, then carry on…) and you’ve probably enabled the VirtualBox mode (if not – open a command prompt as administrator, and run “multipass set local.driver=virtualbox“). Now, you can start sorting out your bridges.

Sorting out bridges

First things first, you need to launch a virtual machine. I did, and it generated a name for my image.

C:\Users\JON>multipass launch
Launched: witty-kelpie

Fab! We have a running virtual machine, and you should be able to get a shell in there by running multipass shell "witty-kelpie" (the name of the machine it launched before). But, uh-oh. We have the “default” NAT interface of this device mapped, not a bridged interface.

C:\Users\JON>multipass shell "witty-kelpie"
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-76-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Thu Feb  6 10:56:38 GMT 2020

  System load:  0.3               Processes:             82
  Usage of /:   20.9% of 4.67GB   Users logged in:       0
  Memory usage: 11%               IP address for enp0s3: 10.0.2.15
  Swap usage:   0%


0 packages can be updated.
0 updates are security updates.


To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@witty-kelpie:~$

So, exit the machine, and issue a multipass stop "witty-kelpie" command to ask Virtualbox to shut it down.

So, this is where the fun[1] part begins.
[1] The “Fun” part here depends on how you view this specific set of circumstances 😉

We need to get the descriptions of all the interfaces we might want to bridge to this device. I have three interfaces on my machine – a WiFi interface, a Ethernet interface on my laptop, and an Ethernet interface on my USB3 dock. At some point in the past, I renamed these interfaces, so I’d recognise them in the list of interfaces, so they’re not just called “Connection #1”, “Connection #2” and so on… but you should recognise your interfaces.

To get this list of interfaces, open PowerShell (as a “user”), and run this command:

PS C:\Users\JON> Get-NetAdapter -Physical | format-list -property "Name","DriverDescription"

Name              : On-Board Network Connection
DriverDescription : Intel(R) Ethernet Connection I219-LM

Name              : Wi-Fi
DriverDescription : Intel(R) Dual Band Wireless-AC 8260

Name              : Dock Network Connection
DriverDescription : DisplayLink Network Adapter NCM

For reasons best known to the Oracle team, they use the “Driver Description” to identify the interfaces, not the name assigned to the device by the user, so, before we get started, find your interface, and note down the description for later. If you want to bridge “all” of them, make a note of all the interfaces in question, and in the order you want to attach them. Note that Virtualbox doesn’t really like exposing more than 8 NICs without changing the Chipset to ICH9 (but really… 9+ NICs? really??) and the first one is already consumed with the NAT interface you’re using to connect to it… so that gives you 7 bridgeable interfaces. Whee!

So, now you know what interfaces you want to bridge, let’s configure the Virtualbox side. Like I said before you need psexec. I’ve got psexec stored in my Downloads folder. You can only run psexec as administrator, so open up an Administrator command prompt or powershell session, and run your command.

Just for clarity, your commands are likely to have some different paths, so remember that wherever “your” PsExec64.exe command is located, mine is in C:\Users\JON\Downloads\sysinternals\PsExec64.exe, and wherever your vboxmanage.exe is located, mine is in C:\Program Files\Oracle\VirtualBox\vboxmanage.exe.

Here, I’m going to attach my dock port (“DisplayLink Network Adapter NCM”) to the second VirtualBox interface, the Wifi adaptor to the third interface and my locally connected interface to the fourth interface. Your interfaces WILL have different descriptions, and you’re likely not to need quite so many of them!

C:\WINDOWS\system32>C:\Users\JON\Downloads\sysinternals\PsExec64.exe -s "c:\program files\oracle\virtualbox\vboxmanage" modifyvm "witty-kelpie" --nic2 bridged --bridgeadapter2 "DisplayLink Network Adapter NCM" --nic3 bridged --bridgeadapter3 "Intel(R) Dual Band Wireless-AC 8260" --nic4 bridged --bridgeadapter4 "Intel(R) Ethernet Connection I219-LM"

PsExec v2.2 - Execute processes remotely
Copyright (C) 2001-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

c:\program files\oracle\virtualbox\vboxmanage exited on MINILITH with error code 0.

An error code of 0 means that it completed successfuly and with no issues.

If you wanted to use a “Host Only” network (if you’re used to using Vagrant, you might know it as “Private” Networking), then change the NIC you’re interested in from --nicX bridged --bridgeadapterX "Some Description" to --nicX hostonly --hostonlyadapterX "VirtualBox Host-Only Ethernet Adapter" (where X is replaced with the NIC number you want to swap, ranged between 2 and 8, as 1 is the NAT interface you use to SSH into the virtual machine.)

Now we need to check to make sure the machine has it’s requisite number of interfaces. We use the showvminfo flag to the vboxmanage command. It produces a LOT of content, so I’ve manually filtered the lines I want, but you should spot it reasonably quickly.

C:\WINDOWS\system32>C:\Users\JON\Downloads\sysinternals\PsExec64.exe -s "c:\program files\oracle\virtualbox\vboxmanage" showvminfo "witty-kelpie"

PsExec v2.2 - Execute processes remotely
Copyright (C) 2001-2016 Mark Russinovich
Sysinternals - www.sysinternals.com


Name:                        witty-kelpie
Groups:                      /Multipass
Guest OS:                    Ubuntu (64-bit)
<SNIP SOME CONTENT>
NIC 1:                       MAC: 0800273CCED0, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings:  MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 1 Rule(0):   name = ssh, protocol = tcp, host ip = , host port = 53507, guest ip = , guest port = 22
NIC 2:                       MAC: 080027303758, Attachment: Bridged Interface 'DisplayLink Network Adapter NCM', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 3:                       MAC: 0800276EA174, Attachment: Bridged Interface 'Intel(R) Dual Band Wireless-AC 8260', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 4:                       MAC: 080027042135, Attachment: Bridged Interface 'Intel(R) Ethernet Connection I219-LM', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 5:                       disabled
NIC 6:                       disabled
NIC 7:                       disabled
NIC 8:                       disabled
<SNIP SOME CONTENT>

Configured memory balloon size: 0MB

c:\program files\oracle\virtualbox\vboxmanage exited on MINILITH with error code 0.

Fab! We now have working interfaces… But wait, let’s start that VM back up and see what happens.

C:\Users\JON>multipass shell "witty-kelpie"
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-76-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Thu Feb  6 11:31:08 GMT 2020

  System load:  0.1               Processes:             84
  Usage of /:   21.1% of 4.67GB   Users logged in:       0
  Memory usage: 11%               IP address for enp0s3: 10.0.2.15
  Swap usage:   0%


0 packages can be updated.
0 updates are security updates.


Last login: Thu Feb  6 10:56:45 2020 from 10.0.2.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@witty-kelpie:~$

Wait, what….. We’ve still only got the one interface up with an IP address… OK, let’s fix this!

As of Ubuntu 18.04, interfaces are managed using Netplan, and, well, when the VM was built, it didn’t know about any interface past the first one, so we need to get Netplan to get them enabled. Let’s check they’re detected by the VM, and see what they’re all called:

ubuntu@witty-kelpie:~$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 08:00:27:3c:ce:d0 brd ff:ff:ff:ff:ff:ff
3: enp0s8: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 08:00:27:30:37:58 brd ff:ff:ff:ff:ff:ff
4: enp0s9: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 08:00:27:6e:a1:74 brd ff:ff:ff:ff:ff:ff
5: enp0s10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 08:00:27:04:21:35 brd ff:ff:ff:ff:ff:ff
ubuntu@witty-kelpie:~$ 

If you compare the link/ether lines to the output from showvminfo we executed before, you’ll see that the MAC address against enp0s3 matches the NAT interface, while enp0s8 matches the DisplayLink adapter, and so on… So we basically want to ask NetPlan to do a DHCP lookup for all the new interfaces we’ve added to it. If you’ve got 1 NAT and 7 physical interfaces (why oh why…) then you’d have enp0s8, 9, 10, 16, 17, 18 and 19 (I’ll come back to the random numbering in a tic)… so we now need to ask Netplan to do DHCP on all of those interfaces (assuming we’ll be asking for them all to come up!)

If we want to push that in, then we need to add a new file in /etc/netplan called something like 60-extra-interfaces.yaml, that should contain:

network:
  ethernets:
    enp0s8:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 10
    enp0s9:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 11
    enp0s10:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 12
    enp0s16:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 13
    enp0s17:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 14
    enp0s18:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 15
    enp0s19:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 16

Going through this, we basically ask netplan not to assume the interfaces are attached. This stops the boot process for waiting for a timeout to configure each of the interfaces before proceeding, so it means your boot should be reasonably fast, particularly if you don’t always attach a network cable or join a Wifi network on all your interfaces!

We also say to assume we want IPv4 DHCP on each of those interfaces. I’ve done IPv4 only, as most people don’t use IPv6 at home, but if you are doing IPv6 as well, then you’d also need the same lines that start dhcp4 copied to show dhcp6 (like dhcp6: yes and dhcp6-overrides: route-metric: 10)

The eagle eyed of you might notice that the route metric increases for each extra interface. This is because realistically, if you have two interfaces connected (perhaps if you’ve got wifi enabled, and plug a network cable in), then you’re more likely to want to prioritize traffic going over the lower numbered interfaces than the higher number interfaces.

Once you’ve created this file, you need to run netplan apply or reboot your machine.

So, yehr, that gets you sorted on the interface front.

Bridging Summary

To review, you launch your machine with multipass launch, and immediately stop it with multipass stop "vm-name", then, as an admin, run psexec vboxmanage modifyvm "vm-name" --nic2 bridged --bridgedadapter2 "NIC description", and then start the machine with multipass start "vm-name". Lastly, ask the interface to do DHCP by manipulating your Netplan configuration.

Interface Names in VirtualBox

Just a quick note on the fact that the interface names aren’t called things like eth0 any more. A few years back, Ubuntu (amongst pretty much all of the Linux distribution vendors) changed from using eth0 style naming to what they call “Predictable Network Interface Names”. This derives the names from things like, what the BIOS provides for on-board interfaces, slot index numbers for PCI Express ports, and for this case, the “geographic location of the connector”. In Virtualbox, these interfaces are provided as the “Geographically” attached to “port 0” (so enp0 are all on port 0), but for some reason, they broadcast themselves as being attached to the port 0 at “slots” 3, 8, 9, 10, 16, 17, 18 and 19… hence enp0s3 and so on. shrug It just means that if you don’t have the interfaces coming up on the interfaces you’re expecting, you need to run ip link to confirm the MAC addresses match.

Port Forwarding

Unlike with the Bridging, we don’t need to power down the VM to add the extra interfaces, we just need to use psexec (as an admin again) to execute a vboxmanage command – in this case, it’s:

C:\WINDOWS\system32>C:\Users\JON\Downloads\sysinternals\PsExec64.exe -s "c:\program files\oracle\virtualbox\vboxmanage" controlvm "witty-kelpie" --natpf1 "myport,tcp,,1234,,2345"

OK, that’s a bit more obscure. Basically it says “Create a NAT rule on NIC 1 called ‘myport’ to forward TCP connections from port 1234 attached to any IP associated to the host OS to port 2345 attached to the DHCP supplied IP on the guest OS”.

If we wanted to run a DNS server in our VM, we could run multiple NAT rules in the same command, like this:

C:\WINDOWS\system32>C:\Users\JON\Downloads\sysinternals\PsExec64.exe -s "c:\program files\oracle\virtualbox\vboxmanage" controlvm "witty-kelpie" --natpf1 "TCP DNS,tcp,127.0.0.1,53,,53" --natpf1 "UDP DNS,udp,127.0.0.1,53,,53"

If we then decide we don’t need those NAT rules any more, we just (with psexec and appropriate paths) issue: vboxmanage controlvm "vm-name" --natpf1 delete "TCP DNS"

Using ifupdown instead of netplan

Late Edit 2020-04-01: On Github, someone asked me how they could use the same type of config with netplan, but instead on a 16.04 system. Ubuntu 16.04 doesn’t use netplan, but instead uses ifupdown instead. Here’s how to configure the file for ifupdown:

You can either add the following stanzas to /etc/network/interfaces, or create a separate file for each interface in /etc/network/interfaces.d/<number>-<interface>.cfg (e.g. /etc/network/interfaces.d/10-enp0s8.cfg)

allow-hotplug enp0s8
iface enp0s8 inet dhcp
  metric 10

To re-iterate, in the above netplan file, the interfaces we identified were: enp0s8, enp0s9, enp0s10, enp0s16, enp0s17, enp0s18 and enp0s19. Each interface was incrementally assigned a route metric, starting at 10 and ending at 16, so enp0s8 has a metric of 10, while enp0s16 has a metric of 13, and so on. To build these files, I’ve created this brief shell script you could use:

export metric=10
for int in 8 9 10 16 17 18 19
do
  echo -e "allow-hotplug enp0s${int}\niface enp0s${int} inet dhcp\n  metric $metric" > /etc/network/interfaces.d/enp0s${int}.cfg
  ((metric++))
done

As before, you could reboot to make the changes to the interfaces. Bear in mind, however, that unlike Netplan, these interfaces will try and DHCP on boot with this configuration, so boot time will take longer if every interface attached isn’t connected to a network.

Using NAT Network instead of NAT Interface

Late update 2020-05-26: Ruzsinsky contacted me by email to ask how I’d use a “NAT Network” instead of a “NAT interface”. Essentially, it’s the same as the Bridged interface above, with one other tweak first, we need to create the Net Network, with this command (as an Admin)

C:\WINDOWS\system32>C:\Users\JON\Downloads\sysinternals\PsExec64.exe -s "c:\program files\oracle\virtualbox\vboxmanage" natnetwork add --netname MyNet --network 192.0.2.0/24

Next, stop your multipass virtual machine with multipass stop "witty-kelpie", and configure your second interface, like this:

C:\WINDOWS\system32>C:\Users\JON\Downloads\sysinternals\PsExec64.exe -s "c:\program files\oracle\virtualbox\vboxmanage" modifyvm "witty-kelpie" --nic2 natnetwork --nat-network2 "MyNet"

PsExec v2.2 - Execute processes remotely
Copyright (C) 2001-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

c:\program files\oracle\virtualbox\vboxmanage exited on MINILITH with error code 0.

Start the vm with multipass start "witty-kelpie", open a shell with it multipass shell "witty-kelpie", become root sudo -i and then configure the interface in /etc/netplan/60-extra-interfaces.yaml like we did before:

network:
  ethernets:
    enp0s8:
      optional: yes
      dhcp4: yes
      dhcp4-overrides:
        route-metric: 10

And then run netplan apply or reboot.

What I would say, however, is that the first interface seems to be expected to be a NAT interface, at which point, having a NAT network as well seems a bit pointless. You might be better off using a “Host Only” (or “Private”) network for any inter-host communications between nodes at a network level… But you know your environments and requirements better than I do :)

Featured image is “vieux port Marseille” by “Jeanne Menjoulet” on Flickr and is released under a CC-BY-ND license.

Today I learned… that you can look at the “cloud-init” files on your target server…

Today I have been debugging why my Cloud-init scripts weren’t triggering on my Openstack environment.

I realised that something was wrong when I tried to use the noVNC console[1] with a password I’d set… no luck. So, next I ran a command to review the console logs[2], and saw a message (now, sadly, long gone – so I can’t even include it here!) suggesting there was an issue parsing my YAML file. Uh oh!

I’m using Ansible’s os_server module, and using templates to complete the userdata field, which in turn gets populated as cloud-init scripts…. and so clearly I had two ways to debug this – prefix my ansible playbook with a few debug commands, but then that can get messy… OR SSH into the box, and look through the logs. I knew I could SSH in, so the cloud-init had partially fired, but it just wasn’t parsing what I’d submitted. I had a quick look around, and found a post which mentioned debugging cloud-init. This mentioned that there’s a path (/var/lib/cloud/instances/$UUID/) you can mess around in, to remove some files to “fool” cloud-init into thinking it’s not been run… but, I reasoned, why not just see what’s there.

And in there, was the motherlode – user-data.txt…. bingo.

In the jinja2 template I was using to populate the userdata, I’d referenced another file, again using a template. It looks like that template needs an extra line at the end, otherwise, it all runs together.

Whew!

This does concern me a little, as I had previously been using this stanza to “simply” change the default user password to something a little less complicated:


#cloud-config
ssh_pwauth: True
chpasswd:
  list: |
    ubuntu:{{ default_password }}
  expire: False

But now that I look at the documentation, I realise you can also specify that as a pre-hashed value (in which case, you would suffix that default_password item above with |password_hash('sha512')) which makes it all better again!

[1] If you run openstack --os-cloud cloud_a console url show servername gives you a URL to visit that has an HTML5 based VNC-ish client. Note the “cloud_a” and “servername” should be replaced by your clouds.yml reference and the server name or server ID you want to connect to.
[2] Like before, openstack --os-cloud cloud_a console log show servername gives you the output of the boot sequence (e.g. dmesg plus the normal startup commands, and finally, cloud-init). It can be useful. Equally, it’s logs… which means there’s a lot to wade through!