"inventory" by "Lee" on Flickr

Using a AWS Dynamic Inventory with Ansible 2.10

In Ansible 2.10, Ansible started bundling modules and plugins as “Collections”, basically meaning that Ansible didn’t need to make a release every time a vendor wanted to update the libraries it required, or API changes required new fields to be supplied to modules. As part of this split between “Collections” and “Core”, the AWS modules and plugins got moved into a collection.

Now, if you’re using Ansible 2.9 or earlier, this probably doesn’t impact you, but there are some nice features in Ansible 2.10 that I wanted to use, so… buckle up :)

Getting started with Ansible 2.10, using a virtual environment

If you currently are using Ansible 2.9, it’s probably worth creating a “python virtual environment”, or “virtualenv” to try out Ansible 2.10. I did this on my Ubuntu 20.04 machine by typing:

sudo apt install -y virtualenv
mkdir -p ~/bin
cd ~/bin
virtualenv -p python3 ansible_2.10

The above ensures that you have virtualenv installed, creates a directory called “bin” in your home directory, if it doesn’t already exist, and then places the virtual environment, using Python3, into a directory there called “ansible_2.10“.

Whenever we want to use this new environment you must activate it, using this command:

source ~/bin/ansible_2.10/bin/activate

Once you’ve executed this, any binary packages created in that virtual environment will be executed from there, in preference to the file system packages.

You can tell that you’ve “activated” this virtual environment, because your prompt changes from user@HOST:~$ to (ansible_2.10) user@HOST:~$ which helps πŸ˜€

Next, let’s create a requirements.txt file. This will let us install the environment in a repeatable manner (which is useful with Ansible). Here’s the content of this file.

ansible>=2.10
boto3
botocore

So, this isn’t just Ansible, it’s also the supporting libraries we’ll need to talk to AWS from Ansible.

We execute the following command:

pip install -r requirements.txt

Note, on Windows Subsystem for Linux version 1 (which I’m using) this will take a reasonable while, particularly if it’s crossing from the WSL environment into the Windows environment, depending on where you have specified the virtual environment to be placed.

If you get an error message about something to do with being unable to install ffi, then you’ll need to install the package libffi-dev with sudo apt install -y libffi-dev and then re-run the pip install command above.

Once the installation has completed, you can run ansible --version to see something like the following:

ansible 2.10.2
  config file = None
  configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/user/ansible_2.10/lib/python3.8/site-packages/ansible
  executable location = /home/user/ansible_2.10/bin/ansible
  python version = 3.8.2 (default, Jul 16 2020, 14:00:26) [GCC 9.3.0]

Configuring Ansible for local collections

Ansible relies on certain paths in the filesystem to store things like collections, roles and modules, but I like to circumvent these things – particularly if I’m developing something, or moving from one release to the next. Fortunately, Ansible makes this very easy, using a single file, ansible.cfg to tell the code that’s running in this path where to find things.

A quick note on File permissions with ansible.cfg

Note that the POSIX file permissions for the directory you’re in really matter! It must be set to 775 (-rwxrwxr-x) as a maximum – if it’s “world writable” (the last number) it won’t use this file! Other options include 770, 755. If you accidentally set this as world writable, or are using a directory from the “Windows” side of WSL, then you’ll get an error message like this:

[WARNING]: Ansible is being run in a world writable directory (/home/user/ansible_2.10_aws), ignoring it as an ansible.cfg source. For more information see
https://docs.ansible.com/ansible/devel/reference_appendices/config.html#cfg-in-world-writable-dir

That link is this one: https://docs.ansible.com/ansible/devel/reference_appendices/config.html#cfg-in-world-writable-dir and has some useful advice.

Back to configuring Ansible

In ansible.cfg, I have the following configured:

[defaults]
collections_paths = ./collections:~/.ansible/collections:/usr/share/ansible/collections

This file didn’t previously exist in this directory, so I created that file.

This block asks Ansible to check the following paths in order:

  • collections in this path (e.g. /home/user/ansible_2.10_aws/collections)
  • collections in the .ansible directory under the user’s home directory (e.g. /home/user/.ansible/collections)
  • and finally /usr/share/ansible/collections for system-wide collections.

If you don’t configure Ansible with the ansible.cfg file, the default is to store the collections in ~/.ansible/collections, but you can “only have one version of the collection”, so this means that if you’re relying on things not to change when testing, or if you’re running multiple versions of Ansible on your system, then it’s safest to store the collections in the same file tree as you’re working in!

Installing Collections

Now we have Ansible 2.10 installed, and our Ansible configuration file set up, let’s get our collection ready to install. We do this with a requirements.yml file, like this:

---
collections:
- name: amazon.aws
  version: ">=1.2.1"

What does this tell us? Firstly, that we want to install the Amazon AWS collection from Ansible Galaxy. Secondly that we want at least the most current version (which is currently version 1.2.1). If you leave the version line out, it’ll get “the latest” version. If you replace ">=1.2.1" with 1.2.1 it’ll install exactly that version from Galaxy.

If you want any other collections, you add them as subsequent lines (more details here), like this:

collections:
- name: amazon.aws
  version: ">=1.2.1"
- name: some.other
- name: git+https://example.com/someorg/somerepo.git
  version: 1.0.0
- name: git@example.com:someorg/someotherrepo.git

Once we’ve got this file, we run this command to install the content of the requirements.yml: ansible-galaxy collection install -r requirements.yml

In our case, this installs just the amazon.aws collection, which is what we want. Fab!

Getting our dynamic inventory

Right, so we’ve got all the pieces now that we need! Let’s tell Ansible that we want it to ask AWS for an inventory. There are three sections to this.

Configuring Ansible, again!

We need to open up our ansible.cfg file. Because we’re using the collection to get our Dynamic Inventory plugin, we need to tell Ansible to use that plugin. Edit ./ansible.cfg in your favourite editor, and add this block to the end:

[inventory]
enable_plugins = aws_ec2

If you previously created the ansible.cfg file when you were setting up to get the collection installed alongside, then your ansible.cfg file will look (something) like this:

[defaults]
collections_paths     = ./collections:~/.ansible/collections:/usr/share/ansible/collections

[inventory]
enable_plugins = amazon.aws.aws_ec2

Configure AWS

Your machine needs to have access tokens to interact with the AWS API. These are stored in ~/.aws/credentials (e.g. /home/user/.aws/credentials) and look a bit like this:

[default]
aws_access_key_id = A1B2C3D4E5F6G7H8I9J0
aws_secret_access_key = A1B2C3D4E5F6G7H8I9J0a1b2c3d4e5f6g7h8i9j0

Set up your inventory

In a bit of a change to how Ansible usually does the inventory, to have a plugin based dynamic inventory, you can’t specify a file any more, you have to specify a directory. So, create the file ./inventory/aws_ec2.yaml (having created the directory inventory first). The file contains the following:

---
plugin: amazon.aws.aws_ec2

Late edit 2020-12-01: Further to the comment by Giovanni, I’ve amended this file snippet from plugin: aws_ec2 to plugin: amazon.aws.aws_ec2.

By default, this just retrieves the hostnames of any running EC2 instance, as you can see by running ansible-inventory -i inventory --graph

@all:
  |--@aws_ec2:
  |  |--ec2-176-34-76-187.eu-west-1.compute.amazonaws.com
  |  |--ec2-54-170-131-24.eu-west-1.compute.amazonaws.com
  |  |--ec2-54-216-87-131.eu-west-1.compute.amazonaws.com
  |--@ungrouped:

I need a bit more detail than this – I like to use the tags I assign to AWS assets to decide what I’m going to target the machines with. I also know exactly which regions I’ve got my assets in, and what I want to use to get the names of the devices, so this is what I’ve put in my aws_ec2.yaml file:

---
plugin: amazon.aws.aws_ec2
keyed_groups:
- key: tags
  prefix: tag
- key: 'security_groups|json_query("[].group_name")'
  prefix: security_group
- key: placement.region
  prefix: aws_region
- key: tags.Role
  prefix: role
regions:
- eu-west-1
hostnames:
- tag:Name
- dns-name
- public-ip-address
- private-ip-address

Late edit 2020-12-01: Again, I’ve amended this file snippet from plugin: aws_ec2 to plugin: amazon.aws.aws_ec2.

Now, when I run ansible-inventory -i inventory --graph, I get this output:

@all:
  |--@aws_ec2:
  |  |--euwest1-firewall
  |  |--euwest1-demo
  |  |--euwest1-manager
  |--@aws_region_eu_west_1:
  |  |--euwest1-firewall
  |  |--euwest1-demo
  |  |--euwest1-manager
  |--@role_Firewall:
  |  |--euwest1-firewall
  |--@role_Firewall_Manager:
  |  |--euwest1-manager
  |--@role_VM:
  |  |--euwest1-demo
  |--@security_group_euwest1_allow_all:
  |  |--euwest1-firewall
  |  |--euwest1-demo
  |  |--euwest1-manager
  |--@tag_Name_euwest1_firewall:
  |  |--euwest1-firewall
  |--@tag_Name_euwest1_demo:
  |  |--euwest1-demo
  |--@tag_Name_euwest1_manager:
  |  |--euwest1-manager
  |--@tag_Role_Firewall:
  |  |--euwest1-firewall
  |--@tag_Role_Firewall_Manager:
  |  |--euwest1-manager
  |--@tag_Role_VM:
  |  |--euwest1-demo
  |--@ungrouped:

To finish

Now you have your dynamic inventory, you can target your playbook at any of the groups listed above (like role_Firewall, aws_ec2, aws_region_eu_west_1 or some other tag) like you would any other inventory assignment, like this:

---
- hosts: role_Firewall
  gather_facts: false
  tasks:
  - name: Show the name of this device
    debug:
      msg: "{{ inventory_hostname }}"

And there you have it. Hope this is useful!

Late edit: 2020-11-23: Following a conversation with Andy from Work, we’ve noticed that if you’re trying to do SSM connections, rather than username/password based ones, you might want to put this in your aws_ec2.yml file:

---
plugin: amazon.aws.aws_ec2
hostnames:
  - tag:Name
compose:
  ansible_host: instance_id
  ansible_connection: 'community.aws.aws_ssm'

Late edit 2020-12-01: One final instance, I’ve changed plugin: aws_ec2 to plugin: amazon.aws.aws_ec2.

This will keep your hostnames “pretty” (with whatever you’ve tagged it as), but will let you connect over SSM to the Instance ID. Good fun :)

Featured image is β€œinventory” by β€œLee” on Flickr and is released under a CC-BY-SA license.

"Submarine" by "NH53" on Flickr

Recursive Git Submodules

One of my colleagues asked today about using recursive git submodules. First, let’s quickly drill into what a Submodule is.

Git Submodules

A submodule is a separate git repository, attached to the git repository you’re working on via two “touch points” – a file in the root directory called .gitmodules, and, when checked out, the HEAD file in the .git directory.

When you clone a repository with a submodule attached, it creates the directory the submodule will be cloned into, but leave it empty, unless you either do git submodule update --init --recursive or, when you clone the repository initially, you can ask it to pull any recursive submodules, like this git clone https://your.vcs.example.org/someorg/somerepo.git --recursive.

Git stores the commit reference of the submodule (via a file in .git/modules/$SUBMODULE_NAME/HEAD which contains the commit reference). If you change a file in that submodule, it marks the path of the submodule as “dirty” (because you have an uncommitted change), and if you either commit that change, or pull an updated commit from the source repository, then it will mark the path of the submodule as having changed.

In other words, you can track two separate but linked parts of your code in the same tree, working on each in turn, and without impacting each other code base.

I’ve used this, mostly with Ansible playbooks, where I’ve consumed someone else’s role, like this:

My_Project
|
+- Roles
|  |
|  +- <SUBMODULE> someorg.some_role
|  +- <SUBMODULE> anotherorg.another_role
+- inventory
+- playbook.yml
+- .git
|  |
|  +- HEAD
|  +- modules
|  +- etc
+- .gitmodules

In .gitmodules the file looks like this:

[submodule "module1"]
 path = module1
 url = https://your.vcs.example.org/someorg/module1.git

Once you’ve checked out this submodule, you can do any normal operations in this submodule, like pulls, pushes, commits, tags, etc.

So, what happens when you want to nest this stuff?

Nesting Submodule Recursion

So, my colleague wanted to have files in three layers of directories. In this instance, I’ve simulated this by creating three directories, root, module1 and module2. Typically these would be pulled from their respective Git Service paths, like GitHub or GitLab, but here I’m just using everything on my local file system. Where, in the following screen shot, you see /tmp/ you could easily replace that with https://your.vcs.example.org/someorg/.

The output of running mkdir {root,module1,module2} ; cd root ; git init ; cd ../module1 ; git init ; cd ../module2 ; git init ; touch README.md ; git add README.md ; git commit -m 'Added README.md' ; cd ../module1 ; git submodule add /tmp/module2 module2 ; git commit -m 'Added module2' ; cd ../root ; git submodule add /tmp/module1 module1 ; git submodule update --init --recursive ; tree showing the resulting tree of submodules under the root directory.
The output of running mkdir {root,module1,module2} ; cd root ; git init ; cd ../module1 ; git init ; cd ../module2 ; git init ; touch README.md ; git add README.md ; git commit -m ‘Added README.md’ ; cd ../module1 ; git submodule add /tmp/module2 module2 ; git commit -m ‘Added module2’ ; cd ../root ; git submodule add /tmp/module1 module1 ; git submodule update –init –recursive ; tree showing the resulting tree of submodules under the root directory.

So, here, we’ve created these three paths (basically to initiate the repositories), added a basic commit to the furthest submodule (module2), then done a submodule add into the next furthest submodule (module1) and finally added that into the root tree.

Note, however, when you perform the submodule add it doesn’t automatically clone any submodules, and if you were to, from another machine, perform git clone https://your.vcs.example.org/someorg/root.git you wouldn’t get any of the submodules (neither module1 nor module2) without adding either --recursive to the clone command (like this: git clone --recursive https://your.vcs.example.org/someorg/root.git), or by running the follow-up command git submodule update --init --recursive.

Oh, and if any of these submodules are updated? You need to go in and pull those updates, and then commit that change, like this!

The workflow of pulling updates for each of the submodules, with git add, git commit, and git pull, also noting that when a module has been changed, it shows as having “new commits”.
And here we have the finish of the workflow, updating the other submodules. Note that some of these steps (probably the ones in the earlier image) are likely to have been performed by some other developer on another system, so having all the updates on one machine is pretty rare!

The only thing which isn’t in these submodules is if you’ve done a git clone of the root repo (using the terms from the above screen images), the submodules won’t be using the “master” branch (or a particular “tag” or “branch hame”, for that matter), but will instead be using the commit reference. If you wanted to switch to a specific branch or tag, then you’d need to issue the command git checkout some_remote/some_branch or git checkout master instead of (in the above screen captures) git pull.

If you have any questions or issues with this post, please either add a comment, or contact me via one of the methods at the top or side of this page!

Featured image is β€œSubmarine” by β€œNH53” on Flickr and is released under a CC-BY license.

"Field Notes - Sweet Tooth" by "The Marmot" on Flickr

Multi-OS builds in AWS with Terraform – some notes from the field!

Late edit: 2020-05-22 – Updated with better search criteria from colleague conversations

I’m building a proof of concept for … well, a product that needs testing on several different Linux and Windows variants on AWS and Azure. I’m building this environment with Terraform, and it’s thrown me a few curve balls, so I thought I’d document the issues I’ve had!

The versions of distributions I have tested are the latest releases of each of these images at-or-near the time of writing. The major version listed is the earliest I have tested, so no assumption is made about previous versions, and later versions, after the time of this post should not assume any of this data is also accurate!

(Fujitsu Staff – please contact me on my work email address for details on how to get the internal AMIs of our builds of these images πŸ˜„)

Linux Distributions

On the whole, I tend to be much more confident and knowledgable about Linux distributions. I’ve also done far more installs of each of these!

Almost all of these installs are Free of Charge, with the exception of Red Hat Enterprise Linux, which requires a subscription fee, and this can be “Pay As You Go” or “Bring Your Own License”. These sorts of things are arranged for me, so I don’t know how easy or hard it is to organise these licenses!

These builds all use cloud-init, via either a cloud-init yaml script, or some shell scripting language (usually accepted to be bash). If this script fails to execute, you will find your user-data file in /var/lib/cloud/instance/scripts/part-001. If this is a shell script then you will be able to execute it by running that script as your root user.

Amazon Linux 2 or Amzn2

Amazon Linux2 is the “preferred” distribution for Amazon Web Services (AWS) (surprisingly enough). It is based on Red Hat Enterprise Linux (RHEL), and many of the instructions you’ll want to run to install software will use RHEL based instructions. This platform is not available outside the AWS ecosystem, as far as I can tell, although you might be able to run it on-prem.

Software packages are limited in this distribution, so any “extra” features require the installation of the “EPEL” repository, by executing the command sudo amazon-linux-extras install epel and then using the yum command to install further packages. I needed nginx for part of my build, and this was only in EPEL.

Amzn2 AMI Lookup

data "aws_ami" "amzn2" {
  most_recent = true

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-2.0.*-gp2"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["amazon"] # Canonical
}

Amzn2 User Account

Amazon Linux 2 images under AWS have a default “ec2-user” user account. sudo will allow escalation to Root with no password prompt.

Amzn2 AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed. To manage the interface, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 and apply changes with ifdown eth0 ; ifup eth0.

Amzn2 user-data / Cloud-Init Troubleshooting

I’ve found the output from user-data scripts appearing in /var/log/cloud-init-output.log.

CentOS 7

For starters, AWS doesn’t have an official CentOS8 image, so I’m a bit stymied there! In fact, as far as I can make out, CentOS is only releasing ISOs for builds now, and not any cloud images. There’s an open issue on their bug tracker which seems to suggest that it’s not going to get any priority any time soon! Blimey.

This image may require you to “subscribe” to the image (particularly if you have a “private marketplace”), but this will be requested of you (via a URL provided on screen) when you provision your first machine with this AMI.

Like with Amzn2, CentOS7 does not have nginx installed, and like Amzn2, installation of the EPEL library is not a difficult task. CentOS7 bundles a file to install the EPEL, installed by running yum install epel-release. After this is installed, you have the “full” range of software in EPEL available to you.

CentOS AMI Lookup

data "aws_ami" "centos7" {
  most_recent = true

  filter {
    name   = "name"
    values = ["CentOS Linux 7*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["aws-marketplace"]
}

CentOS User Account

CentOS7 images under AWS have a default “centos” user account. sudo will allow escalation to Root with no password prompt.

CentOS AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed. To manage the interface, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 and apply changes with ifdown eth0 ; ifup eth0.

CentOS Cloud-Init Troubleshooting

I’ve run several different user-data located bash scripts against this system, and the logs from these scripts are appearing in the default syslog file (/var/log/syslog) or by running journalctl -xefu cloud-init. They do not appear in /var/log/cloud-init-output.log.

Red Hat Enterprise Linux (RHEL) 7 and 8

Red Hat has both RHEL7 and RHEL8 images in the AWS market place. The Proof Of Value (POV) I was building was only looking at RHEL7, so I didn’t extensively test RHEL8.

Like Amzn2 and CentOS7, RHEL7 needs EPEL installing to have additional packages installed. Unlike Amzn2 and CentOS7, you need to obtain the EPEL package from the Fedora Project. Do this by executing these two commands:

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install epel-release-latest-7.noarch.rpm

After this is installed, you’ll have access to the broader range of software that you’re likely to require. Again, I needed nginx, and this was not available to me with the stock install.

RHEL7 AMI Lookup

data "aws_ami" "rhel7" {
  most_recent = true

  filter {
    name   = "name"
    values = ["RHEL-7*GA*Hourly*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["309956199498"] # Red Hat
}

RHEL8 AMI Lookup

data "aws_ami" "rhel8" {
  most_recent = true

  filter {
    name   = "name"
    values = ["RHEL-8*HVM-*Hourly*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["309956199498"] # Red Hat
}

RHEL User Accounts

RHEL7 and RHEL8 images under AWS have a default “ec2-user” user account. sudo will allow escalation to Root with no password prompt.

RHEL AWS Interface Configuration

The primary interface is called eth0. Network Manager is installed, and the eth0 interface has a profile called “System eth0” associated to it.

RHEL Cloud-Init Troubleshooting

In RHEL7, as per CentOS7, logs from user-data scripts are appear in the general syslog file (in this case, /var/log/messages) or by running journalctl -xefu cloud-init. They do not appear in /var/log/cloud-init-output.log.

In RHEL8, logs from user-data scrips now appear in /var/log/cloud-init-output.log.

Ubuntu 18.04

At the time of writing this, the vendor, who’s product I was testing, categorically stated that the newest Ubuntu LTS, Ubuntu 20.04 (Focal Fossa) would not be supported until some time after our testing was complete. As such, I spent no time at all researching or planning to use this image.

Ubuntu is the only non-RPM based distribution in this test, instead being based on the Debian project’s DEB packages. As such, it’s range of packages is much wider. That said, for the project I was working on, I required a later version of nginx than was available in the Ubuntu Repositories, so I had to use the nginx Personal Package Archive (PPA). To do this, I found the official PPA for the nginx project, and followed the instructions there. Generally speaking, this would potentially risk any support from the distribution vendor, as it’s not certified or supported by the project… but I needed that version, so I had to do it!

Ubuntu 18.04 AMI Lookup

data "aws_ami" "ubuntu1804" {
  most_recent = true

  filter {
    name   = "name"
    values = ["*ubuntu*18.04*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["099720109477"] # Canonical
}

Ubuntu 18.04 User Accounts

Ubuntu 18.04 images under AWS have a default “ubuntu” user account. sudo will allow escalation to Root with no password prompt.

Ubuntu 18.04 AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed, and instead Ubuntu uses Netplan to manage interfaces. The file to manage the interface defaults is /etc/netplan/50-cloud-init.yaml. If you struggle with this method, you may wish to install ifupdown and define your configuration in /etc/network/interfaces.

Ubuntu 18.04 Cloud-Init Troubleshooting

In Ubuntu 18.04, logs from user-data scrips appear in /var/log/cloud-init-output.log.

Windows

This section is far more likely to have it’s data consolidated here!

Windows has a common “standard” username – Administrator, and a common way of creating a password (this is generated on-boot, and the password is transferred to the AWS Metadata Service, which it is retrieved and decrypted with the SSH key you’ve used to build the “authentication” to the box) which Terraform handles quite nicely.

The network device is referred to as “AWS PV Network Device #0”. It can be managed with powershell, netsh (although apparently Microsoft are rumbling about demising this script), or from the GUI.

Windows 2012R2

This version is very old now, and should be compared to Windows 7 in terms of age. It is only supported by Microsoft with an extended maintenance package!

Windows 2012R2 AMI Lookup

data "aws_ami" "w2012r2" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2012-R2_RTM-English-64Bit-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2012R2 Cloud-Init Troubleshooting

Logs from the Metadata Service can be found in C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt. You can also find the userdata script in C:\Program Files\Amazon\Ec2ConfigService\Scripts\UserScript.ps1. This can be launched and debugged using PowerShell ISE, which is in the “Start” menu.

Windows 2016

This version is reasonably old now, and should be compared to Windows 8 in terms of age. It is supported until 2022 in “mainline” support.

Windows 2016 AMI Lookup

data "aws_ami" "w2016" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2016-English-Full-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2016 Cloud-Init Troubleshooting

The metadata service has moved from Windows 2016 and onwards. Logs are stored in a partially hidden directory tree, so you may need to click in the “Address” bar of the Explorer window and type in part of this path. The path to these files is: C:\ProgramData\Amazon\EC2-Windows\Launch\Log. I say “files” as there are two parts to this file – an “Ec2Launch.log” file which reports on the boot process, and “UserdataExecution.log” which shows the output from the userdata script.

Unlike with the Windows 2012R2 version, you can’t get hold of the actual userdata script on the filesystem, you need to browse to a special path in the metadata service (actually, technically, you can do this with any of the metadata services – OpenStack, Azure, and so on) which is: http://169.254.169.254/latest/user-data/

This will contain userdata between a <powershell> and </powershell> pair of tags. This would need to be copied out of this URL and pasted into a new file on your local machine to determine why issues are occurring. Again, I would recommend using PowerShell ISE from the Start Menu to debug your code.

Windows 2019

This version is the most recent released version of Windows Server, and should be compared to Windows 10 in terms of age.

Windows 2019 AMI Lookup

data "aws_ami" "w2019" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2019-English-Full-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2019 Cloud-Init Troubleshooting

Functionally, the same as Windows 2016, but to recap, the metadata service has moved from Windows 2016 and onwards. Logs are stored in a partially hidden directory tree, so you may need to click in the “Address” bar of the Explorer window and type in part of this path. The path to these files is: C:\ProgramData\Amazon\EC2-Windows\Launch\Log. I say “files” as there are two parts to this file – an “Ec2Launch.log” file which reports on the boot process, and “UserdataExecution.log” which shows the output from the userdata script.

Unlike with the Windows 2012R2 version, you can’t get hold of the actual userdata script on the filesystem, you need to browse to a special path in the metadata service (actually, technically, you can do this with any of the metadata services – OpenStack, Azure, and so on) which is: http://169.254.169.254/latest/user-data/

This will contain userdata between a <powershell> and </powershell> pair of tags. This would need to be copied out of this URL and pasted into a new file on your local machine to determine why issues are occurring. Again, I would recommend using PowerShell ISE from the Start Menu to debug your code.

Featured image is β€œField Notes – Sweet Tooth” by β€œThe Marmot” on Flickr and is released under a CC-BY license.

β€œNew shoes” by β€œMorgaine” from Flickr

Making Windows Cloud-Init Scripts run after a reboot (Using Terraform)

I’m currently building a Proof Of Value (POV) environment for a product, and one of the things I needed in my environment was an Active Directory domain.

To do this in AWS, I had to do the following steps:

  1. Build my Domain Controller
    1. Install Windows
    2. Set the hostname (Reboot)
    3. Promote the machine to being a Domain Controller (Reboot)
    4. Create a domain user
  2. Build my Member Server
    1. Install Windows
    2. Set the hostname (Reboot)
    3. Set the DNS client to point to the Domain Controller
    4. Join the server to the domain (Reboot)

To make this work, I had to find a way to trigger build steps after each reboot. I was working with Windows 2012R2, Windows 2016 and Windows 2019, so the solution had to be cross-version. Fortunately I found this script online! That version was great for Windows 2012R2, but didn’t cover Windows 2016 or later… So let’s break down what I’ve done!

In your userdata field, you need to have two sets of XML strings, as follows:

<persist>true</persist>
<powershell>
$some = "powershell code"
</powershell>

The first block says to Windows 2016+ “keep trying to run this script on each boot” (note that you need to stop it from doing non-relevant stuff on each boot – we’ll get to that in a second!), and the second bit is the PowerShell commands you want it to run. The rest of this now will focus just on the PowerShell block.

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {
      0 {
        $runCount = 1 + [int]$runCount
        Set-ItemProperty -Path $Path -Name RunCount -Value $runCount
        if ($ver -match 2012) {
          #Enable user data
          $EC2SettingsFile = "$env:ProgramFiles\Amazon\Ec2ConfigService\Settings\Config.xml"
          $xml = [xml](Get-Content $EC2SettingsFile)
          $xmlElement = $xml.get_DocumentElement()
          $xmlElementToModify = $xmlElement.Plugins
          
          foreach ($element in $xmlElementToModify.Plugin)
          {
            if ($element.name -eq "Ec2HandleUserData") {
              $element.State="Enabled"
            }
          }
          $xml.Save($EC2SettingsFile)
        }
        $some = "PowerShell Script"
      }
    }
  }

Whew, what a block! Well, again, we can split this up into a couple of bits.

In the first few lines, we build a pointer, a note which says “We got up to here on our previous boots”. We then read that into a variable and find that number and execute any steps in the block with that number. That’s this block:

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {

    }
  }

The next part (and you’ll repeat it for each “number” of reboot steps you need to perform) says “increment the number” then “If this is Windows 2012, remind the userdata handler that the script needs to be run again next boot”. That’s this block:

      0 {
        $runCount = 1 + [int]$runCount
        Set-ItemProperty -Path $Path -Name RunCount -Value $runCount
        if ($ver -match 2012) {
          #Enable user data
          $EC2SettingsFile = "$env:ProgramFiles\Amazon\Ec2ConfigService\Settings\Config.xml"
          $xml = [xml](Get-Content $EC2SettingsFile)
          $xmlElement = $xml.get_DocumentElement()
          $xmlElementToModify = $xmlElement.Plugins
          
          foreach ($element in $xmlElementToModify.Plugin)
          {
            if ($element.name -eq "Ec2HandleUserData") {
              $element.State="Enabled"
            }
          }
          $xml.Save($EC2SettingsFile)
        }
        
      }

In fact, it’s fair to say that in my userdata script, this looks like this:

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {
      0 {
        ${file("templates/step.tmpl")}

        ${templatefile(
          "templates/rename_windows.tmpl",
          {
            hostname = "SomeMachine"
          }
        )}
      }
      1 {
        ${file("templates/step.tmpl")}

        ${templatefile(
          "templates/join_ad.tmpl",
          {
            dns_ipv4 = "192.0.2.1",
            domain_suffix = "ad.mycorp",
            join_account = "ad\someuser",
            join_password = "SomePassw0rd!"
          }
        )}
      }
    }
  }

Then, after each reboot, you need a new block. I have a block to change the computer name, a block to join the machine to the domain, and a block to install an software that I need.

Featured image is β€œNew shoes” by β€œMorgaine” on Flickr and is released under a CC-BY-SA license.

Opening to my video: Screencast 003 - Gitlab

Screencast 003: Gitlab

I’ve done a new mentoring style video, talking about how to use a self-hosted version of Gitlab for basic group projects and individual projects.

Screencast 003: Gitlab

Also available on Archive.org and LBRY.

Late edit 2020-03-25: To build the Gitlab environment I created, take a look at this git repository, which uses Terraform, some cloud init scripts and an ansible playbook. In particular, look at the following files:

If you just want to build the Gitlab environment, then it’s worth removing or renaming (to anything that isn’t .tf – I use .tf_unload) the files load_aws_module.tf, load_awx_module.tf, load_azure_module.tf

Opening to my video: Screencast 002 - A quick walk through Git

Screencast 002: A quick walk through Git (a mentoring style video)

I have done a follow-up Mentoring style video to support my last one. This video shows how to fix some of the issues in Git I came across in my last mentoring video!

Screencast 002: A quick walk through Git

I took some advice from a colleague who noticed that I skipped past a couple of issues with my Git setup, so I re-did them :) I hope this makes sense, and at 35 minutes, is a bit more understandable than the last 1h15 video!

Also on LBRY and Archive.org

Opening to my video: Screencast 001 - Ansible and Inspec using Vagrant

Screencast 001: Ansible and Inspec with Vagrant and Git (a mentoring style video)

If you’ve ever wondered how I use Ansible and Inspec, or wondered why some of my Vagrant files look like they do, well, I want to start recording some “mentor” style videos… You know how, if you were sitting next to someone who’s a mentor to you, and you watch how they build a solution.

The first one was released last night!

I recently saw a video by Chris Hartjes on how he creates his TDD (Test driven development) based PHP projects, and I really wanted to emulate that style, but talking about the things I use.

This was my second attempt at recording a mentoring style video yesterday, the first was shown to the Admin Admin Podcast listeners group on Telegram, and then sacrificed to the demo gods (there were lots of issues in that first video) never to be seen again.

From a tooling perspective, I’m using a remote virtual machine running Ubuntu Mate 18.04 over RDP (to improve performance) with xrdp and Remmina, OBS is running locally to record the content, and I’m using Visual Studio Code, git, Vagrant and Virtualbox, as well as Ansible and Inspec.

Late edit 2020-02-29: Like videos like this, hate YouTube? It’s also on archive.org: https://archive.org/details/JonTheNiceGuyScreencast001

Late edit 2020-03-01: Popey told me about LBRY.tv when I announced this on the Admin Admin Podcast telegram channel, and so I’ve also copied the video to there: https://lbry.tv/@JonTheNiceGuy:b/Screencast001-Ansible-and-Inspec-with-Vagrant:8

"So many coats..." by "Scott Griggs" on Flickr

Migrating from docker-compose to Kubernetes Files

Just so you know. This is a long article to explain my wandering path through understanding Kubernetes (K8S). It’s not an article to explain to you how to use K8S with your project. I hit a lot of blockers, due to the stack I’m using and I document them all. This means there’s a lot of noise and not a whole lot of Signal.

In a previous blog post I created a docker-compose.yml file for a PHP based web application. Now that I have a working Kubernetes environment, I wanted to port the configuration files into Kubernetes.

Initially, I was pointed at Kompose, a tool for converting docker-compose files to Kubernetes YAML formatted files, and, in fact, this gives me 99% of what I need… except, the current version uses beta API version flags for some of it’s outputted files, and this isn’t valid for the version of Kubernetes I’m running. So, let’s wind things back a bit, and find out what you need to do to use kompose first and then we can tweak the output file next.

Note: I’m running all these commands as root. There’s a bit of weirdness going on because I’m using the snap of Docker and I had a few issues with running these commands as a user… While I could have tried to get to the bottom of this with sudo and watching logs, I just wanted to push on… Anyway.

Here’s our “simple” docker-compose file.

version: '3'
services:
  db:
    build:
      context: .
      dockerfile: mariadb/Dockerfile
    image: localhost:32000/db
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: a_root_pw
      MYSQL_USER: a_user
      MYSQL_PASSWORD: a_password
      MYSQL_DATABASE: a_db
    expose:
      - 3306
  nginx:
    build:
      context: .
      dockerfile: nginx/Dockerfile
    image: localhost:32000/nginx
    ports:
      - 1980:80
  phpfpm:
    build:
      context: .
      dockerfile: phpfpm/Dockerfile
    image: localhost:32000/phpfpm

This has three components – the MariaDB database, the nginx web server and the PHP-FPM CGI service that nginx consumes. The database service exposes a port (3306) to other containers, with a set of hard-coded credentials (yep, that’s not great… working on that one!), while the nginx service opens port 1980 to port 80 in the container. So far, so … well, kinda sensible :)

If we run kompose convert against this docker-compose file, we get five files created; db-deployment.yaml, nginx-deployment.yaml, phpfpm-deployment.yaml, db-service.yaml and nginx-service.yaml. If we were to run kompose up on these, we get an error message…

Well, actually, first, we get a whole load of “INFO” and “WARN” lines up while kompose builds and pushes the containers into the MicroK8S local registry (a registry is a like a package repository, for containers), which is served by localhost:32000 (hence all the image: localhost:3200/someimage lines in the docker-compose.yml file), but at the end, we get (today) this error:

INFO We are going to create Kubernetes Deployments, Services and PersistentVolumeClaims for your Dockerized application. If you need different kind of resources, use the 'kompose convert' and 'kubectl create -f' commands instead.

FATA Error while deploying application: Get http://localhost:8080/api: dial tcp 127.0.0.1:8080: connect: connection refused

Uh oh! Well, this is a known issue at least! Kubernetes used to use, by default, http on port 8080 for it’s service, but now it uses https on port 6443. Well, that’s what I thought! In this issue on the MicroK8S repo, it says that it uses a different port, and you should use microk8s.kubectl cluster-info to find the port… and yep… Kubernetes master is running at https://127.0.0.1:16443. Bah.

root@microk8s-a:~/glowing-adventure# microk8s.kubectl cluster-info
Kubernetes master is running at https://127.0.0.1:16443
Heapster is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/heapster/proxy
CoreDNS is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Grafana is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
InfluxDB is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

So, we export the KUBERNETES_MASTER environment variable, which was explained in that known issue I mentioned before, and now we get a credential prompt:

Please enter Username:

Oh no, again! I don’t have credentials!! Fortunately the MicroK8S issue also tells us how to find those! You run microk8s.config and it tells you the username!

roo@microk8s-a:~/glowing-adventure# microk8s.config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <base64-data>
    server: https://10.0.2.15:16443
  name: microk8s-cluster
contexts:
- context:
    cluster: microk8s-cluster
    user: admin
  name: microk8s
current-context: microk8s
kind: Config
preferences: {}
users:
- name: admin
  user:
    username: admin
    password: QXdUVmN3c3AvWlJ3bnRmZVJmdFhpNkJ3cDdkR3dGaVdxREhuWWo0MmUvTT0K

So, our username is “admin” and our password is … well, in this case a string starting QX and ending 0K but yours will be different!

We run kompose up again, and put in the credentials… ARGH!

FATA Error while deploying application: Get https://127.0.0.1:16443/api: x509: certificate signed by unknown authority

Well, now, that’s no good! Fortunately, a quick Google later, and up pops this Stack Overflow suggestion (mildly amended for my circumstances):

openssl s_client -showcerts -connect 127.0.0.1:16443 < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' | sudo tee /usr/local/share/ca-certificates/k8s.crt
update-ca-certificates
systemctl restart snap.docker.dockerd

Right then. Let’s run that kompose up statement again…

INFO We are going to create Kubernetes Deployments, Services and PersistentVolumeClaims for your Dockerized application. If you need different kind of resources, use the 'kompose convert' and 'kubectl create -f' commands instead.

Please enter Username: 
Please enter Password: 
INFO Deploying application in "default" namespace
INFO Successfully created Service: nginx
FATA Error while deploying application: the server could not find the requested resource

Bah! What resource do I need? Well, actually, there’s a bug in 1.20.0 of Kompose, and it should be fixed in 1.21.0. The “resource” it’s talking about is, I think, that one of the APIs refuses to process the converted YAML files. As a result, the “resource” is the service that won’t start. So, instead, let’s convert the file into the output YAML files, and then take a peak at what’s going wrong.

root@microk8s-a:~/glowing-adventure# kompose convert
INFO Kubernetes file "nginx-service.yaml" created
INFO Kubernetes file "db-deployment.yaml" created
INFO Kubernetes file "nginx-deployment.yaml" created
INFO Kubernetes file "phpfpm-deployment.yaml" created

So far, so good! Now let’s run kubectl apply with each of these files.

root@microk8s-a:~/glowing-adventure# kubectl apply -f nginx-service.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
service/nginx configured
root@microk8s-a:~# kubectl apply -f nginx-deployment.yaml
error: unable to recognize "nginx-deployment.yaml": no matches for kind "Deployment" in version "extensions/v1beta1"

Apparently the service files are all OK, the problem is in the deployment files. Hmm OK, let’s have a look at what could be wrong. Here’s the output file:

root@microk8s-a:~/glowing-adventure# cat nginx-deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.20.0 (f3d54d784)
  creationTimestamp: null
  labels:
    io.kompose.service: nginx
  name: nginx
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      annotations:
        kompose.cmd: kompose convert
        kompose.version: 1.20.0 (f3d54d784)
      creationTimestamp: null
      labels:
        io.kompose.service: nginx
    spec:
      containers:
      - image: localhost:32000/nginx
        name: nginx
        ports:
        - containerPort: 80
        resources: {}
      restartPolicy: Always
status: {}

Well, the extensions/v1beta1 API version doesn’t seem to support “Deployment” options any more, so let’s edit it to change that to what the official documentation example shows today. We need to switch to using the apiVersion: apps/v1 value. Let’s see what happens when we make that change!

root@microk8s-a:~/glowing-adventure# kubectl apply -f nginx-deployment.yaml
error: error validating "nginx-deployment.yaml": error validating data: ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec; if you choose to ignore these errors, turn validation off with --validate=false

Hmm this seems to be a fairly critical issue. A selector basically tells the orchestration engine which images we want to be deployed. Let’s go back to the official example. So, we need to add the “selector” value in the “spec” block, at the same level as “template”, and it needs to match the labels we’ve specified. It also looks like we don’t need most of the metadata that kompose has given us. So, let’s adjust the deployment to look a bit more like that example.

root@microk8s-a:~/glowing-adventure# cat nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: localhost:32000/nginx
        name: nginx
        ports:
        - containerPort: 80
        resources: {}
      restartPolicy: Always

Fab. And what happens when we run it?

root@microk8s-a:~/glowing-adventure# kubectl apply -f nginx-deployment.yaml
deployment.apps/nginx created

Woohoo! Let’s apply all of these now.

root@microk8s-a:~/glowing-adventure# for i in db-deployment.yaml nginx-deployment.yaml nginx-service.yaml phpfpm-deployment.yaml; do kubectl apply -f $i ; done
deployment.apps/db created
deployment.apps/nginx unchanged
service/nginx unchanged
deployment.apps/phpfpm created

Oh, hang on a second, that service (service/nginx) is unchanged, but we changed the label from io.kompose.service: nginx to app: nginx, so we need to fix that. Let’s open it up and edit it!

apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.20.0 (f3d54d784)
  creationTimestamp: null
  labels:
    io.kompose.service: nginx
  name: nginx
spec:
  ports:
  - name: "1980"
    port: 1980
    targetPort: 80
  selector:
    io.kompose.service: nginx
status:
  loadBalancer: {}

Ah, so this has the “annotations” field too, in the metadata, and, as suspected, it’s got the io.kompose.service label as the selector. Hmm OK, let’s fix that.

root@microk8s-a:~/glowing-adventure# cat nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  ports:
  - name: "1980"
    port: 1980
    targetPort: 80
  selector:
    app: nginx
status:
  loadBalancer: {}

Much better. And let’s apply it…

root@microk8s-a:~/glowing-adventure# kubectl apply -f nginx-service.yaml
service/nginx configured

Fab! So, let’s review the state of the deployments, the services, the pods and the replication sets.

root@microk8s-a:~/glowing-adventure# kubectl get deploy
NAME     READY   UP-TO-DATE   AVAILABLE   AGE
db       1/1     1            1           8m54s
nginx    0/1     1            0           8m53s
phpfpm   1/1     1            1           8m48s

Hmm. That doesn’t look right.

root@microk8s-a:~/glowing-adventure# kubectl get pod
NAME                      READY   STATUS             RESTARTS   AGE
db-f78f9f69b-grqfz        1/1     Running            0          9m9s
nginx-7774fcb84c-cxk4v    0/1     CrashLoopBackOff   6          9m8s
phpfpm-66945b7767-vb8km   1/1     Running            0          9m3s
root@microk8s-a:~# kubectl get rs
NAME                DESIRED   CURRENT   READY   AGE
db-f78f9f69b        1         1         1       9m18s
nginx-7774fcb84c    1         1         0       9m17s
phpfpm-66945b7767   1         1         1       9m12s

Yep. What does “CrashLoopBackOff” even mean?! Let’s check the logs. We need to ask the pod itself, not the deployment, so let’s use the kubectl logs command to ask.

root@microk8s-a:~/glowing-adventure# kubectl logs nginx-7774fcb84c-cxk4v
2020/01/17 08:08:50 [emerg] 1#1: host not found in upstream "phpfpm" in /etc/nginx/conf.d/default.conf:10
nginx: [emerg] host not found in upstream "phpfpm" in /etc/nginx/conf.d/default.conf:10

Hmm. That’s not good. We were using the fact that Docker just named everything for us in the docker-compose file, but now in Kubernetes, we need to do something different. At this point I ran out of ideas. I asked on the McrTech slack for advice. I was asked to run this command, and would you look at that, there’s nothing for nginx to connect to.

root@microk8s-a:~/glowing-adventure# kubectl get service
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP    24h
nginx        ClusterIP   10.152.183.62   <none>        1980/TCP   9m1s

It turns out that I need to create a service for each of the deployments. So, now I have a separate service for each one. I copied the nginx-service.yaml file into db-service.yaml and phpfpm-service.yaml, edited the files and now… tada!

root@microk8s-a:~/glowing-adventure# kubectl get service
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
db           ClusterIP   10.152.183.61   <none>        3306/TCP   5m37s
kubernetes   ClusterIP   10.152.183.1    <none>        443/TCP    30h
nginx        ClusterIP   10.152.183.62   <none>        1980/TCP   5h54m
phpfpm       ClusterIP   10.152.183.69   <none>        9000/TCP   5m41s

But wait… How do I actually address nginx now? Huh. No external-ip (not even “pending”, which is what I ended up with), no ports to talk to. Uh oh. Now I need to understand how to hook this service up to the public IP of this node. Ahh, see up there it says “ClusterIP”? That means “this service is only available INSIDE the cluster”. If I change this to “NodePort” or “LoadBalancer”, it’ll attach that port to the external interface.

What’s the difference between “NodePort” and “LoadBalancer”? Well, according to this page, if you are using a managed Public Cloud service that supports an external load balancer, then putting this to “LoadBalancer” should attach your “NodePort” to the provider’s Load Balancer automatically. Otherwise, you need to define the “NodePort” value in your config (which must be a value between 30000 and 32767, although that is configurable for the node). Once you’ve done that, you can hook your load balancer up to that port, for example Client -> Load Balancer IP (TCP/80) -> K8S Cluster IP (e.g. TCP/31234)

So, how does this actually look. I’m going to use the “LoadBalancer” option, because if I ever deploy this to “live”, I want it to integrate with the load balancer, but for now, I can cope with addressing a “high port”. Right, well, let me open back up that nginx-service.yaml, and make the changes.

root@microk8s-a:~/glowing-adventure# cat nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  type: LoadBalancer
  ports:
  - name: nginx
    nodePort: 30000
    port: 1980
    targetPort: 80
  selector:
    app: nginx
status:
  loadBalancer: {}

The key parts here are the lines type: LoadBalancer and nodePort: 30000 under spec: and ports: respectively. Note that I can use, at this point type: LoadBalancer and type: NodePort interchangably, but, as I said, if you were using this in something like AWS or Azure, you might want to do it differently!

So, now I can curl http://192.0.2.100:30000 (where 192.0.2.100 is the address of my “bridged interface” of K8S environment) and get a response from my PHP application, behind nginx, and I know (from poking at it a bit) that it works with my Database.

OK, one last thing. I don’t really want lots of little files which have got config items in. I quite liked the docker-compose file as it was, because it had all the services in as one block, and I could run “docker-compose up”, but the kompose script split it out into lots of pieces. In Kubernetes, if the YAML file it loads has got a divider in it (a line like this: ---) then it stops parsing it at that point, and starts reading the file after that as a new file. Like this I could have the following layout:

apiVersion: apps/v1
kind: Deployment
more: stuff
---
apiVersion: v1
kind: Service
more: stuff
---
apiVersion: apps/v1
kind: Deployment
more: stuff
---
apiVersion: v1
kind: Service
more: stuff

But, thinking about it, I quite like having each piece logically together, so I really want db.yaml, nginx.yaml and phpfpm.yaml, where each of those files contains both the deployment and the service. So, let’s do that. I’ll do one file, so it makes more sense, and then show you the output.

root@microk8s-a:~/glowing-adventure# mkdir -p k8s
root@microk8s-a:~/glowing-adventure# mv db-deployment.yaml k8s/db.yaml
root@microk8s-a:~/glowing-adventure# echo "---" >> k8s/db.yaml
root@microk8s-a:~/glowing-adventure# cat db-service.yaml >> k8s/db.yaml
root@microk8s-a:~/glowing-adventure# rm db-service.yaml
root@microk8s-a:~/glowing-adventure# cat k8s/db.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: db
  name: db
spec:
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - env:
        - name: MYSQL_DATABASE
          value: a_db
        - name: MYSQL_PASSWORD
          value: a_password
        - name: MYSQL_ROOT_PASSWORD
          value: a_root_pw
        - name: MYSQL_USER
          value: a_user
        image: localhost:32000/db
        name: db
        resources: {}
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: db
  name: db
spec:
  ports:
  - name: mariadb
    port: 3306
    targetPort: 3306
  selector:
    app: db
status:
  loadBalancer: {}

So, now, if I do kubectl apply -f k8s/db.yaml I’ll get this output:

root@microk8s-a:~/glowing-adventure# kubectl apply -f k8s/db.yaml
deployment.apps/db unchanged
service/db unchanged

You can see the final files in the git repo for this set of tests.

Next episode, I’ll start looking at making my application scale (as that’s the thing that Kubernetes is known for) and having more than one K8S node to house my K8S pods!

Featured image is “So many coats…” by “Scott Griggs” on Flickr and is released under a CC-BY-ND license.

"Shipping Containers" by "asgw" on Flickr

Creating my first Docker containerized LEMP (Linux, nginx, MariaDB, PHP) application

Want to see what I built without reading the why’s and wherefore’s? The git repository with all the docker-compose goodness is here!

Late edit 2020-01-16: The fantastic Jerry Steel, my co-host on The Admin Admin podcast looked at what I wrote, and made a few suggestions. I’ve updated the code in the git repo, and I’ll try to annotate below when I’ve changed something. If I miss it, it’s right in the Git repo!

One of the challenges I set myself this Christmas was to learn enough about Docker to put an arbitrary PHP application, that I would previously have misused Vagrant to contain.

Just before I started down this rabbit hole, I spoke to my Aunt about some family tree research my father had left behind after he died, and how I wished I could easily share the old tree with her (I organised getting her a Chromebook a couple of years ago, after fighting with doing remote support for years on Linux and Windows laptops). In the end, I found a web application for genealogical research called HuMo-gen, that is a perfect match for both projects I wanted to look at.

HuMo-gen was first created in 1999, with a PHP version being released in 2005. It used MySQL or MariaDB as the Database engine. I was reasonably confident that I could have created a Vagrantfile to deliver this on my home server, but I wanted to try something new. I wanted to use the “standard” building blocks of Docker and Docker-Compose, and some common containers to make my way around learning Docker.

I started by looking for some resources on how to build a Docker container. Much of the guidance I’d found was to use Docker-Compose, as this allows you to stand several components up at the same time!

In contrast to how Vagrant works (which is basically a CLI wrapper to many virtual machine services), Docker isolates resources for a single process that runs on a machine. Where in Vagrant, you might run several processes on one machine (perhaps, in this instance, nginx, PHP-FPM and MariaDB), with Docker, you’re encouraged to run each “service” as their own containers, and link them together with an overlay network. It’s possible to also do the same with Vagrant, but you’ll end up with an awful lot of VM overhead to separate out each piece.

So, I first needed to select my services. My initial line-up was:

  • MariaDB
  • PHP-FPM
  • Apache’s httpd2 (replaced by nginx)

I was able to find official Docker images for PHP, MariaDB and httpd, but after extensive tweaking, I couldn’t make the httpd image talk the way I wanted it to with the PHP image. Bowing to what now seems to be conventional wisdom, I swapped out the httpd service for nginx.

One of the stumbling blocks for me, particularly early on, was how to build several different Dockerfiles (these are basically the instructions for the container you’re constructing). Here is the basic outline of how to do this:

version: '3'
services:
  yourservice:
    build:
      context: .
      dockerfile: relative/path/to/Dockerfile

In this docker-compose.yml file, I tell it that to create the yourservice service, it needs to build the docker container, using the file in ./relative/path/to/Dockerfile. This file in turn contains an instruction to import an image.

Each service stacks on top of each other in that docker-compose.yml file, like this:

version: '3'
services:
  service1:
    build:
      context: .
      dockerfile: service1/Dockerfile
    image: localhost:32000/service1
  service2:
    build:
      context: .
      dockerfile: service2/Dockerfile
    image: localhost:32000/service2

Late edit 2020-01-16: This previously listed Dockerfile/service1, however, much of the documentation suggested that Docker gets quite opinionated about the file being called Dockerfile. While docker-compose can work around this, it’s better to stick to tradition :) The docker-compose.yml files below have also been adjusted accordingly. I’ve also added an image: somehost:1234/image_name line to help with tagging the images for later use. It’s not critical to what’s going on here, but I found it useful with some later projects.

To allow containers to see ports between themselves, you add the expose: command in your docker-compose.yml, and to allow that port to be visible from the “outside” (i.e. to the host and upwards), use the ports: command listing the “host” port (the one on the host OS), then a colon and then the “target” port (the one in the container), like these:

version: '3'
services:
  service1:
    build:
      context: .
      dockerfile: service1/Dockerfile
    image: localhost:32000/service1
    expose:
    - 12345
  service2:
    build:
      context: .
      dockerfile: service2/Dockerfile
    image: localhost:32000/service2
    ports:
    - 8000:80

Now, let’s take a quick look into the Dockerfiles. Each “statement” in a Dockerfile adds a new “layer” to the image. For local operations, this probably isn’t a problem, but when you’re storing these images on a hosted provider, you want to keep these images as small as possible.

I built a Database Dockerfile, which is about as small as you can make it!

FROM mariadb:10.4.10

Yep, one line. How cool is that? In the docker-compose.yml file, I invoke this, like this:

version: '3'
services:
  db:
    build:
      context: .
      dockerfile: mariadb/Dockerfile
    image: localhost:32000/db
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: a_root_pw
      MYSQL_USER: a_user
      MYSQL_PASSWORD: a_password
      MYSQL_DATABASE: a_db
    expose:
      - 3306

OK, so this one is a bit more complex! I wanted it to build my Dockerfile, which is “mariadb/Dockerfile“. I wanted it to restart the container whenever it failed (which hopefully isn’t that often!), and I wanted to inject some specific environment variables into the file – the root and user passwords, a user account and a database name. Initially I was having some issues where it wasn’t building the database with these credentials, but I think that’s because I wasn’t “building” the new database, I was just using it. I also expose the MariaDB (MySQL) port, 3306 to the other containers in the docker-compose.yml file.

Let’s take a look at the next part! PHP-FPM. Here’s the Dockerfile:

FROM php:7.4-fpm
RUN docker-php-ext-install pdo pdo_mysql
ADD --chown=www-data:www-data public /var/www/html

There’s a bit more to this, but not loads. We build our image from a named version of PHP, and install two extensions to PHP, pdo and pdo_mysql. Lastly, we copy the content of the “public” directory into the /var/www/html path, and make sure it “belongs” to the right user (www-data).

I’d previously tried to do a lot more complicated things with this Dockerfile, but it wasn’t working, so instead I slimmed it right down to just this, and the docker-compose.yml is a lot simpler too.

  phpfpm:
    build:
      context: .
      dockerfile: phpfpm/Dockerfile
    image: localhost:32000/phpfpm

See! Loads simpler! Now we need the complicated bit! :) This is the Dockerfile for nginx.

FROM nginx:1.17.7
COPY nginx/default.conf /etc/nginx/conf.d/default.conf

COPY public /var/www/html

Weirdly, even though I’ve added version numbers for MariaDB and PHP, I’ve not done the same for nginx, perhaps I should! Late edit 2020-01-16: I’ve put a version number on there now, previously where it said nginx:1.17.7 it actually said nginx:latest.

I’ve created the configuration block for nginx in a single “RUN” line. Late edit 2020-01-16: This Dockerfile now doesn’t have a giant echo 'stuff' > file block either, following Jerry’s advice, and I’m using COPY instead of ADD on his advice too. I’ll show that config file below. There’s a couple of high points for me here!

server {
  index index.php index.html;
  server_name _;
  error_log /proc/self/fd/2;
  access_log /proc/self/fd/1;
  root /var/www/html;
  location ~ \.php$ {
    try_files $uri =404;
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    fastcgi_pass phpfpm:9000;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param PATH_INFO $fastcgi_path_info;
  }
}
  • server_name _; means “use this block for all unnamed requests”.
  • access_log /proc/self/fd/1; and error_log /proc/self/fd/2;These are links to the “stdout” and “stderr” file descriptors (or pointers to other parts of the filesystem), and basically means that when you do docker-compose logs, you’ll see the HTTP logs for the server! These two files are guaranteed to be there, while /dev/stderr isn’t!

Because nginx is “just” caching the web content, and I know the content doesn’t need to be written to from nginx, I knew I didn’t need to do the chown action, like I did with the PHP-FPM block.

Lastly, I need to configure the docker-compose.yml file for nginx:

  nginx:
    build:
      context: .
      dockerfile: Dockerfile/nginx
    image: localhost:32000/nginx
    ports:
      - 127.0.0.1:1980:80

I’ve gone for a slightly unusual ports configuration when I deployed this to my web server… you see, I already have the HTTP port (TCP/80) configured for use on my home server – for running the rest of my web services. During development, on my home machine, the ports line instead showed “1980:80” because I was running this on Instead, I’m running this application bound to “localhost” (127.0.0.1) on a different port number (1980 selected because it could, conceivably, be a birthday of someone on this system), and then in my local web server configuration, I’m proxying connections to this service, with HTTPS encryption as well. That’s all outside the scope of this article (as I probably should be using something like Traefik, anyway) but it shows you how you could bind to a separate port too.

Anyway, that was my Docker journey over Christmas, and I look forward to using it more, going forward!

Featured image is “Shipping Containers” by “asgw” on Flickr and is released under a CC-BY license.