What I Did Last Week – Page 3 – A nice guy's view on life

My talk summaries from BarCamp Manchester 9

2019-09-252019-09-25 JonTheNiceGuy Leave a comment

BarCamp Manchester 9 (#BcMcr9) is a BarCamp style Unconference. It was held in the offices of Auto Trader in the centre of Manchester. It was a two day event, however, I was unable to attend the Saturday. Sundays are usually quieter days, and apparently the numbers were approximately half of the peak of Saturday on the Sunday.

Lunch was provided by Auto Trader. The day was split into 7 slots, or sessions, running for 25 minutes each, with 5 minutes between slots to change rooms. There were three theatre layout rooms, each with a projector, and one room with soft chairs around the edges.

There and Back Again/How The Internet Works

Format: Presentation with s lides. 30ish attendees.

Slot: Slot 1 Sunday 11:00-11:25

Notes: This slide deck was reused from when I delivered it in 2012. Some stuff had changed (the prevalence of WiFi being one, CAT5e being referenced raised some giggles), but most had not.

There were some comments raised during the talk about the slides, but nothing significant (mostly by network engineers, commenting on things like routing a local network. Ugh.)

Following the talk, someone came up to suggest some changes (primarily that the slides need to link back to the graphics created). Someone else noted that there were too many acronyms that should probably have been explained. As such, this deck is likely to change and be published here at some point soon.

I sent a tweet, following this talk:

At #BCMCR9? See my talk on “How The Internet Works” and looking for the slides? See here: https://www.slideshare.net/JonTheNiceGuy/there-and-back-again-15506394 And feel free to message me if you’ve got any questions!
Jon Spriggs (@jontheniceguy) @ Sep 22 12:03pm

Automation in an Infrastructure as Code World

Format: Presentation with slides. 8 attendees, reduced to 4 half way through.

Slot: Slots 5 and 6 Sunday 14:00-14:55

Notes: This was a trial run of my talk for the Fujitsu FDE Conference I’m attending in a couple of weeks. The audience were notified as such. I took two slots on the “grid”, and half way through my session, half the audience walked out.

Following the talk, someone came and suggested some changes, which I’ll be implementing.

The slides for this talk are still being developed and will be shared after the FDE conference on this site.

Decentralised Social Media? – Secure Scuttlebutt

Format: Conversation with a desktop client application (Patchwork) loaded on the projector, and the Google Play entry for Manyverse on a browser tab. 3 attendees.

Slot: Slot 7 (last slot of the day) Sunday 15:00-15:25

Notes: This was an unplanned session, and probably should have been run earlier in the day. The audience members were very interactive, and asked lots of sensible questions.

I sent a tweet, following this talk:

Did you come to my talk at #BcMcr9 about #SecureScuttleButt? If you run a SSB client (patchwork, patchbay, patchfox or manyverse) and want to follow me, I’m @p3gu8eLHxXC0cuvZ0yXSC05ZROB4X7dpxGCEydIHZ0o=.ed25519 and @3SEA7qNZQPiYFCzY6K57f0LTc9l+Bk6cewQc6lbs/Ek=.ed25519
Jon Spriggs (@jontheniceguy) @ Sep 22 4:46pm

And if you want to know more about #SecureScuttlebutt, take a look at http://Scuttlebutt.nz! It’s fun!
Jon Spriggs (@jontheniceguy) @ Sep 22 4:49pm

AWS Game Day

2019-09-182019-09-18 JonTheNiceGuy 2 Comments

I was invited, through work, to participate in an AWS tradition – the AWS Game Day. This event was organised by my employer for our internal staff to experience a day in the life of a fully deployed AWS environment… and have some fun with it too. The AWS Game Day is a common scenario, and if you’re lucky enough to join one, you’ll probably be doing this one… As such, there will be… #NoSpoilers.

A Game Day (sometimes disambiguated as an “Adversarial Game Day”, because of sporting events) is a day where you either have a dummy environment, or, if you have the scale, a portion of your live network is removed from live service and used as a training ground. In this case, AWS provided a specific dummy environment “Unicorn.Rentals”, and all the attendees are the new recruits to the DevOps Team… Oh, and all the previous DevOps team members had just left the company… all at once.

Attendees were split into teams of four, and each team had a disparate background.

We’re given access to;

Our login panel. This gives us our score, our trending increase or decrease in score over the last “period” (I think it was 5 minutes), our access to the AWS console, and a panel to update the CNAME for the DNS records.
AWS Console. This is a mostly unrestricted account in AWS. There are some things we don’t get access to – for example, we didn’t get the CloudFormation Template for setting up the game day, and we couldn’t make changes to the IAM environment at all. Oh, and what was particularly frustrating was not being able to … Oh yes, I forgot, #NoSpoilers ;)
A central scoreboard of all the teams
A running tally of how we were scored
- Each web request served under X seconds received one score
- Each request served between X and Y seconds received another score,
- Each request served over Y seconds received a third score.
- Failing to respond to a request received a negative score.
- Infrastructure costs deducted points from the score (to stop you just putting stuff at ALL THE SERVERS, ALL THE TIME).
The outgoing DevOps team’s “runbook”. Not too dissimilar to the sort of documentation you write before you go on leave. “If this thing break, run this or just reboot the box”, “You might see this fail with something like this message if the server can’t keep up with the load”. Enough to give you a pointer on where to look, not quite enough to give you the answer :)

The environment we were working on was, well, relatively simple. An auto-scaling web service, running a simple binary on an EC2 instance behind a load balancer. We extended the reach of services we could use (#NoSpoilers!) to give us greater up-time, improved responsiveness and broader scope of access. We were also able to monitor … um, things :) and change the way we viewed the application.

I don’t want to give too many details, because it will spoil the surprises, but I will say that we learned a lot about the services in AWS we had access to, which wasn’t the full product set (just “basic” AWS IaaS tooling).

When the event finished, everyone I spoke to agreed that having a game day is a really good idea! One person said “You only really learn something when you fix it! This is like being called out, without the actual impact to a customer” and another said “I’ve done more with AWS in this day than I have the past couple of months since I’ve been looking at it.”

And, as you can probably tell, I agree! I’d love to see more games days like this! I can see how running something like this, on technology you use in your customer estate, can be unbelievably powerful – especially if you’ve got a mildly nefarious GM running some background processes to break things (#NoSpoilers). If you can make it time-sensitive too (“you’ve got one day to restore service”, or like in this case, “every minute we’re not selling product, we’re losing points”), then that makes it feel like you’ve been called out, but without the stress of feeling like you’re actually going to lose your job at the end of the day (not that I’ve ever actually felt like that when I’ve been called out!!)

Anyway, massive kudos to our AWS SE team for delivering the training, and a huge cheer of support to Sara for getting the event organised. I look forward to getting invited to a new scenario sometime soon! ;)

Here are some pictures from the event!

The teams get to know each other, and we find out about the day ahead! Picture by @Fujitsu_FDE.

Our team, becoming a team by changing the table layout! It made a difference, we went to the top of the leader board for at least 5 minutes! Picture by @Fujitsu_FDE.

The final scores. Picture by @Fujitsu_FDE

Our lucky attendees got to win some of these items! Picture by @Fujitsu_FDE

“Well Done” (ha, yehr, right!) to the winning team (“FIX!”) “UnicornsRUs”. Picture by @Fujitsu_FDE.

The featured image is “AWS Game Day Attendees” by @Fujitsu_FDE.

How to quickly get the next SemVer for your app

2019-06-282023-09-06 JonTheNiceGuy 2 Comments

SemVer, short for Semantic Versioning is an easy way of numbering your software versions. They follow the model Major.Minor.Patch, like this 0.9.1 and has a very opinionated view on what is considered a Major “version bump” and what isn’t.

Sometimes, when writing a library, it’s easy to forget what version you’re on. Perhaps you have a feature change you’re working on, but also bug fixes to two or three previous versions you need to keep an eye on? How about an easy way of figuring out what that next bump should be?

In a recent conversation on the McrTech slack, Steven [0] mentioned he had a simple bash script for incrementing his SemVer numbers, and posted it over. Naturally, I tweaked it to work more easily for my usecases so, this is *mostly* Steven’s code, but with a bit of a wrapper before and after by me :)

#! /bin/bash RE='[^0-9]*$[0-9]*$[.]$[0-9]*$[.]$[0-9]*$$[0-9A-Za-z-]*$' step="$1" if [ -z "$1" ] then step=patch fi base="$2" if [ -z "$2" ] then base=$(git tag 2>/dev/null| tail -n 1) if [ -z "$base" ] then base=0.0.0 fi fi MAJOR=`echo $base | sed -e "s#$RE#\1#"` MINOR=`echo $base | sed -e "s#$RE#\2#"` PATCH=`echo $base | sed -e "s#$RE#\3#"` case "$step" in major) let MAJOR+=1 let MINOR=0 let PATCH=0 ;; minor) let MINOR+=1 let PATCH=0 ;; patch) let PATCH+=1 ;; esac echo "$MAJOR.$MINOR.$PATCH"

Late Edit: 2022-11-19 ictus4u spotted that I wasn’t handling the reset of PATCH to 0 when MINOR gets a bump. I fixed this in the above gist.

So how do you use this? Dead simple, use nextver in a tree that has an existing git tag SemVer to get the next patch number. If you want to bump it to the next minor or major version, try nextver minor or nextver major. If you don’t have a git tag, and don’t specify a SemVer number, then it’ll just assume you’re starting from fresh, and return 0.0.1 :)

Want to find more cool stuff from the original author of this work? Here is a video by the author :)

Featured image is “Feb 11” by “Gordon” on Flickr and is released under a CC-BY-SA license.

Getting Started with Terraform on Azure

2019-06-122019-06-13 JonTheNiceGuy Leave a comment

I’m strongly in the “Ansible is my tool, what needs fixing” camp, when it comes to Infrastructure as Code (IaC) but, I know there are other tools out there which are equally as good. I’ve been strongly advised to take a look at Terraform from HashiCorp. I’m most familiar at the moment with Azure, so this is going to be based around resources available on Azure.

Late edit: I want to credit my colleague, Pete, for his help getting started with this. While many of the code samples have been changed from what he provided me with, if it hadn’t been for these code samples in the first place, I’d never have got started!

Late edit 2: This post was initially based on Terraform 0.11, and I was prompted by another colleague, Jon, that the available documentation still follows the 0.11 layout. 0.12 was released in May, and changes how variables are reused in the code. This post now *should* follow the 0.12 conventions, but if you spot something where it doesn’t, check out this post from the Terraform team.

As with most things, there’s a learning curve, and I struggled to find a “simple” getting started guide for Terraform. I’m sure this is a failing on my part, but I thought it wouldn’t hurt to put something out there for others to pick up and see if it helps someone else (and, if that “someone else” is you, please let me know in the comments!)

Pre-requisites

You need an Azure account for this. This part is very far outside my spectrum of influence, but I’m assuming you’ve got one. If not, look at something like Digital Ocean, AWS or VMWare :) For my “controller”, I’m using Windows Subsystem for Linux (WSL), and wrote the following notes about getting my pre-requisites.

# Pre-requsites ``` mkdir -p ~/bin cd ~/bin sudo apt update && sudo apt install unzip ``` # Install AzureCLI [Source](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-apt) ``` curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc.gpg > /dev/null echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/azure-cli.list > /dev/null sudo apt update && sudo apt install azure-cli ``` # Install Kubectl in WSL [Source](https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-binary-with-curl-on-linux) ``` curl -sLO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod 755 kubectl ``` # Install Terraform in WSL [Source](https://techcommunity.microsoft.com/t5/Azure-Developer-Community-Blog/Configuring-Terraform-on-Windows-10-Linux-Sub-System/ba-p/393845) ``` curl -sLO $(curl https://www.terraform.io/downloads.html | grep "linux_amd64.zip" | cut -d\" -f 2) && unzip terraform*.zip && rm terraform*.zip && chmod 755 terraform ``` # Install terraform extension for VSCode https://marketplace.visualstudio.com/items?itemName=mauve.terraform # Define bash as Default VSCode Shell [Source](https://code.visualstudio.com/docs/editor/integrated-terminal) 1. ctrl+shift+p (Command palete) 1. Type: `default shell` and select `Terminal: Select Default Shell` 1. Choose "WSL Bash"

Building the file structure

One quirk with Terraform, versus other tools like Ansible, is that when you run one of the terraform commands (like terraform init, terraform plan or terraform apply), it reads the entire content of any file suffixed “tf” in that directory, so if you don’t want a file to be loaded, you need to either move it out of the directory, comment it out, or rename it so it doesn’t end .tf. By convention, you normally have three “standard” files in a terraform directory – main.tf, variables.tf and output.tf, but logically speaking, you could have everything in a single file, or each instruction in it’s own file. Because this is a relatively simple script, I’ll use this standard layout.

The actions I’ll be performing are the “standard” steps you’d perform in Azure to build a single Infrastructure as a Service (IAAS) server service:

Create your Resource Group (RG)
Create a Virtual Network (VNET)
Create a Subnet
Create a Security Group (SG) and rules
Create a Public IP address (PubIP) with a DNS name associated to that IP.
Create a Network Interface (NIC)
Create a Virtual Machine (VM), supplying a username and password, the size of disks and VM instance, and any post-provisioning instructions (yep, I’m using Ansible for that :) ).

I’m using Visual Studio Code, but almost any IDE will have integrations for Terraform. The main thing I’m using it for is auto-completion of resource, data and output types, also the fact that control+clicking resource types opens your browser to the documentation page on terraform.io.

So, creating my main.tf, I start by telling it that I’m working with the Terraform AzureRM Provider (the bit of code that can talk Azure API).

This simple statement is enough to get Terraform to load the AzureRM, but it still doesn’t tell Terraform how to get access to the Azure account. Use az login from a WSL shell session to authenticate.

Next, we create our basic resource, vnet and subnet resources.

resource "azurerm_resource_group" "rg" { name = var.resource_group_name location = var.location } resource "azurerm_virtual_network" "vnet" { name = var.vnet_name location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name address_space = ["10.0.0.0/16"] dns_servers = ["8.8.8.8", "8.8.4.4"] } resource "azurerm_subnet" "subnet" { name = var.subnet_name resource_group_name = azurerm_resource_group.rg.name virtual_network_name = azurerm_virtual_network.vnet.name address_prefix = "10.0.1.0/24" }

But wait, I hear you cry, what are those var.something bits in there? I mentioned before that in the “standard” set of files is a “variables.tf” file. In here, you specify values for later consumption. I have recorded variables for the resource group name and location, as well as the VNet name and subnet name. Let’s add those into variables.tf.

variable resource_group_name { default = "MFIOT201906" } variable vnet_name { default = "MFIOT201906_vnet" } variable subnet_name { default = "MFIOT201906_subnet" } variable location { default = "UK South" }

When you’ve specified a resource, you can capture any of the results from that resource to use later – either in the main.tf or in the output.tf files. By creating the resource group (called “rg” here, but you can call it anything from “demo” to “myfirstresourcegroup”), we can consume the name or location with azurerm_resource_group.rg.name and azurerm_resource_group.rg.location, and so on. In the above code, we use the VNet name in the subnet, and so on.

After the subnet is created, we can start adding the VM specific parts – a security group (with rules), a public IP (with DNS name) and a network interface. I’ll create the VM itself later. So, let’s do this.

resource "azurerm_network_security_group" "iaasnsg" { name = "iaas-nsg" location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name } resource "azurerm_network_security_rule" "iaasnsgr" { name = "iaas-nsg-100" priority = 100 direction = "Inbound" access = "Allow" protocol = "Tcp" source_port_range = "*" destination_port_range = "22" source_address_prefix = "${trimspace(data.http.icanhazip.body)}/32" destination_address_prefix = "*" resource_group_name = azurerm_resource_group.rg.name network_security_group_name = azurerm_network_security_group.iaasnsg.name } resource "azurerm_public_ip" "iaaspubip" { name = "iaas-pubip" location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name allocation_method = "Dynamic" domain_name_label = var.dns_prefix } resource "azurerm_network_interface" "iaasnic" { name = "iaas-nic" location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name network_security_group_id = azurerm_network_security_group.iaasnsg.id ip_configuration { name = "iaas-nic-ip" subnet_id = azurerm_subnet.subnet.id private_ip_address_allocation = "Dynamic" public_ip_address_id = azurerm_public_ip.iaaspubip.id } }

BUT WAIT, what’s that ${trimspace(data.http.icanhazip.body)}/32 bit there?? Any resources we want to load from the terraform state, but that we’ve not directly defined ourselves needs to come from somewhere. These items are classed as “data” – that is, we want to know what their values are, but we aren’t *changing* the service to get it. You can also use this to import other resource items, perhaps a virtual network that is created by another team, or perhaps your account doesn’t have the rights to create a resource group. I’ll include a commented out data block in the overall main.tf file for review that specifies a VNet if you want to see how that works.

In this case, I want to put the public IP address I’m coming from into the NSG Rule, so I can get access to the VM, without opening it up to *everyone*. I’m not that sure that my IP address won’t change between one run and the next, so I’m using the icanhazip.com service to determine my IP address. But I’ve not defined how to get that resource yet. Let’s add it to the main.tf for now.

So, we’re now ready to create our virtual machine. It’s quite a long block, but I’ll pull certain elements apart once I’ve pasted this block in.

resource "azurerm_virtual_machine" "main" { name = "iaas-vm" location = azurerm_resource_group.rg.location resource_group_name = azurerm_resource_group.rg.name network_interface_ids = [azurerm_network_interface.iaasnic.id] vm_size = "Standard_DS1_v2" delete_os_disk_on_termination = true delete_data_disks_on_termination = true storage_image_reference { publisher = "Canonical" offer = "UbuntuServer" sku = "18.04-LTS" version = "latest" } storage_os_disk { name = "iaas-os-disk" caching = "ReadWrite" create_option = "FromImage" managed_disk_type = "Standard_LRS" } os_profile { computer_name = "iaas" admin_username = var.ssh_user admin_password = var.ssh_password } os_profile_linux_config { disable_password_authentication = false } provisioner "remote-exec" { inline = ["mkdir /tmp/ansible"] connection { type = "ssh" host = azurerm_public_ip.iaaspubip.fqdn user = var.ssh_user password = var.ssh_password } } provisioner "file" { source = "ansible/" destination = "/tmp/ansible" connection { type = "ssh" host = azurerm_public_ip.iaaspubip.fqdn user = var.ssh_user password = var.ssh_password } } provisioner "remote-exec" { inline = [ "sudo apt update > /tmp/apt_update || cat /tmp/apt_update", "sudo apt install -y python3-pip > /tmp/apt_install_python3_pip || cat /tmp/apt_install_python3_pip", "sudo -H pip3 install ansible > /tmp/pip_install_ansible || cat /tmp/pip_install_ansible", "ansible-playbook /tmp/ansible/main.yml" ] connection { type = "ssh" host = azurerm_public_ip.iaaspubip.fqdn user = var.ssh_user password = var.ssh_password } } }

So, this is broken into four main pieces.

Virtual Machine Details. This part is relatively sensible. Name RG, location, NIC, Size and what happens to the disks when the machine powers on. OK.

name                             = "iaas-vm"
location                         = azurerm_resource_group.rg.location
resource_group_name              = azurerm_resource_group.rg.name
network_interface_ids            = [azurerm_network_interface.iaasnic.id]
vm_size                          = "Standard_DS1_v2"
delete_os_disk_on_termination    = true
delete_data_disks_on_termination = true

Disk details.

storage_image_reference {
  publisher = "Canonical"
  offer     = "UbuntuServer"
  sku       = "18.04-LTS"
  version   = "latest"
}
storage_os_disk {
  name              = "iaas-os-disk"
  caching           = "ReadWrite"
  create_option     = "FromImage"
  managed_disk_type = "Standard_LRS"
}

OS basics: VM Hostname, username of the first user, and it’s password. Note, if you want to use an SSH key, this must be stored for Terraform to use without passphrase. If you mention an SSH key here, as well as a password, this can cause all sorts of connection issues, so pick one or the other.

os_profile {
  computer_name  = "iaas"
  admin_username = var.ssh_user
  admin_password = var.ssh_password
}
os_profile_linux_config {
  disable_password_authentication = false
}

And lastly, provisioning. I want to use Ansible for my provisioning. In this example, I have a basic playbook stored locally on my Terraform host, which I transfer to the VM, install Ansible via pip, and then execute ansible-playbook against the file I uploaded. This could just as easily be a git repo to clone or a shell script to copy in, but this is a “simple” example.

provisioner "remote-exec" {
  inline = ["mkdir /tmp/ansible"]

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

provisioner "file" {
  source = "ansible/"
  destination = "/tmp/ansible"

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

provisioner "remote-exec" {
  inline = [
    "sudo apt update > /tmp/apt_update || cat /tmp/apt_update",
    "sudo apt install -y python3-pip > /tmp/apt_install_python3_pip || cat /tmp/apt_install_python3_pip",
    "sudo -H pip3 install ansible > /tmp/pip_install_ansible || cat /tmp/pip_install_ansible",
    "ansible-playbook /tmp/ansible/main.yml"
  ]

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

This part of code is done in three parts – create upload path, copy the files in, and then execute it. If you don’t create the upload path, it’ll upload just the first file it comes to into the path specified.

Each remote-exec and file provisioner statement must include the hostname, username and either the password, or SSH private key. In this example, I provide just the password.

So, having created all this lot, you need to execute the terraform workload. Initially you do terraform init. This downloads all the provisioners and puts them into the same tree as these .tf files are stored in. It also resets the state of the terraform discovered or created datastore.

Next, you do terraform plan -out tfout. Technically, the tfout part can be any filename, but having something like tfout marks it as clearly part of Terraform. This creates the tfout file with the current state, and whatever needs to change in the Terraform state file on it’s next run. Typically, if you don’t use a tfout file within about 20 minutes, it’s probably worth removing it.

Finally, once you’ve run your plan stage, now you need to apply it. In this case you execute terraform apply tfout. This tfout is the same filename you specified in terraform plan. If you don’t include -out tfout on your plan (or even run a plan!) and tfout in your apply, then you can skip the terraform plan stage entirely.

When I ran this, with a handful of changes to the variable files, I got this result:

$ terraform init Initializing the backend... Initializing provider plugins... The following providers do not have any version constraints in configuration, so the latest version was installed. To prevent automatic upgrades to new major versions that may contain breaking changes, it is recommended to add version = "..." constraints to the corresponding provider blocks in configuration, with the constraint strings suggested below. * provider.azurerm: version = "~> 1.30" * provider.http: version = "~> 1.1" Terraform has been successfully initialized! You may now begin working with Terraform. Try running "terraform plan" to see any changes that are required for your infrastructure. All Terraform commands should now work. If you ever set or change modules or backend configuration for Terraform, rerun this command to reinitialize your working directory. If you forget, other commands will detect it and remind you to do so if necessary.

$ terraform plan -out tfout Refreshing Terraform state in-memory prior to plan... [314/486] The refreshed state will be used to calculate this plan, but will not be persisted to local or remote state storage. data.http.icanhazip: Refreshing state... ------------------------------------------------------------------------ An execution plan has been generated and is shown below. Resource actions are indicated with the following symbols: + create Terraform will perform the following actions: # azurerm_network_interface.iaasnic will be created + resource "azurerm_network_interface" "iaasnic" { + applied_dns_servers = (known after apply) + dns_servers = (known after apply) + enable_accelerated_networking = false + enable_ip_forwarding = false + id = (known after apply) + internal_dns_name_label = (known after apply) + internal_fqdn = (known after apply) + location = "uksouth" + mac_address = (known after apply) + name = "iaas-nic" + network_security_group_id = (known after apply) + private_ip_address = (known after apply) + private_ip_addresses = (known after apply) + resource_group_name = "20190611_JS_RG" + tags = (known after apply) + virtual_machine_id = (known after apply) + ip_configuration { + application_gateway_backend_address_pools_ids = (known after apply) + application_security_group_ids = (known after apply) + load_balancer_backend_address_pools_ids = (known after apply) + load_balancer_inbound_nat_rules_ids = (known after apply) + name = "iaas-nic-ip" + primary = (known after apply) + private_ip_address_allocation = "dynamic" + private_ip_address_version = "IPv4" + public_ip_address_id = (known after apply) + subnet_id = (known after apply) } } # azurerm_network_security_group.iaasnsg will be created + resource "azurerm_network_security_group" "iaasnsg" { + id = (known after apply) + location = "uksouth" + name = "iaas-nsg" + resource_group_name = "20190611_JS_RG" + security_rule = (known after apply) + tags = (known after apply) } # azurerm_network_security_rule.iaasnsgr will be created + resource "azurerm_network_security_rule" "iaasnsgr" { + access = "Allow" + destination_address_prefix = "*" + destination_port_range = "22" + direction = "Inbound" + id = (known after apply) + name = "iaas-nsg-100" [250/486] + network_security_group_name = "iaas-nsg" + priority = 100 + protocol = "Tcp" + resource_group_name = "20190611_JS_RG" + source_address_prefix = "89.101.76.85/32" + source_port_range = "*" } # azurerm_public_ip.iaaspubip will be created + resource "azurerm_public_ip" "iaaspubip" { + allocation_method = "Dynamic" + domain_name_label = "js-20190611-iaas-demo" + fqdn = (known after apply) + id = (known after apply) + idle_timeout_in_minutes = 4 + ip_address = (known after apply) + ip_version = "IPv4" + location = "uksouth" + name = "iaas-pubip" + public_ip_address_allocation = (known after apply) + resource_group_name = "20190611_JS_RG" + sku = "Basic" + tags = (known after apply) } # azurerm_resource_group.rg will be created + resource "azurerm_resource_group" "rg" { + id = (known after apply) + location = "uksouth" + name = "20190611_JS_RG" + tags = (known after apply) } # azurerm_subnet.subnet will be created + resource "azurerm_subnet" "subnet" { + address_prefix = "10.0.1.0/24" + id = (known after apply) + ip_configurations = (known after apply) + name = "20190611_JS_subnet" + resource_group_name = "20190611_JS_RG" + virtual_network_name = "20190611_JS_vnet" } # azurerm_virtual_machine.main will be created + resource "azurerm_virtual_machine" "main" { + availability_set_id = (known after apply) + delete_data_disks_on_termination = true + delete_os_disk_on_termination = true + id = (known after apply) + license_type = (known after apply) + location = "uksouth" + name = "iaas-vm" + network_interface_ids = (known after apply) + resource_group_name = "20190611_JS_RG" + tags = (known after apply) + vm_size = "Standard_DS1_v2" + identity { + identity_ids = (known after apply) + principal_id = (known after apply) + type = (known after apply) } + os_profile { + admin_password = (sensitive value) + admin_username = "tf_admin" + computer_name = "iaas" + custom_data = (known after apply) } + os_profile_linux_config { + disable_password_authentication = false } + storage_data_disk { + caching = (known after apply) + create_option = (known after apply) + disk_size_gb = (known after apply) + lun = (known after apply) + managed_disk_id = (known after apply) + managed_disk_type = (known after apply) + name = (known after apply) + vhd_uri = (known after apply) + write_accelerator_enabled = (known after apply) } + storage_image_reference { + offer = "UbuntuServer" + publisher = "Canonical" + sku = "18.04-LTS" + version = "latest" } + storage_os_disk { + caching = "ReadWrite" + create_option = "FromImage" + disk_size_gb = (known after apply) + managed_disk_id = (known after apply) + managed_disk_type = "Standard_LRS" + name = "iaas-os-disk" + os_type = (known after apply) + write_accelerator_enabled = false } } # azurerm_virtual_network.vnet will be created [144/486] + resource "azurerm_virtual_network" "vnet" { + address_space = [ + "10.0.0.0/16", ] + dns_servers = [ + "8.8.8.8", + "8.8.4.4", ] + id = (known after apply) + location = "uksouth" + name = "20190611_JS_vnet" + resource_group_name = "20190611_JS_RG" + tags = (known after apply) + subnet { + address_prefix = (known after apply) + id = (known after apply) + name = (known after apply) + security_group = (known after apply) } } Plan: 8 to add, 0 to change, 0 to destroy. ------------------------------------------------------------------------ This plan was saved to: tfout To perform exactly these actions, run the following command to apply: terraform apply "tfout"

terraform apply tfout azurerm_resource_group.rg: Creating... azurerm_resource_group.rg: Creation complete after 1s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906] azurerm_network_security_group.iaasnsg: Creating... azurerm_virtual_network.vnet: Creating... azurerm_public_ip.iaaspubip: Creating... azurerm_public_ip.iaaspubip: Creation complete after 4s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/publicIPAddresses/iaas-pubip] azurerm_network_security_group.iaasnsg: Still creating... [11s elapsed] azurerm_virtual_network.vnet: Still creating... [11s elapsed] azurerm_virtual_network.vnet: Creation complete after 11s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/virtualNetworks/MFIOT201906_vnet] azurerm_network_security_group.iaasnsg: Creation complete after 12s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/networkSecurityGroups/iaas-nsg] azurerm_network_security_rule.iaasnsgr: Creating... azurerm_subnet.subnet: Creating... azurerm_network_security_rule.iaasnsgr: Creation complete after 1s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/networkSecurityGroups/iaas-nsg/securityRules/iaas-nsg-100] azurerm_subnet.subnet: Still creating... [10s elapsed] azurerm_subnet.subnet: Creation complete after 10s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/virtualNetworks/MFIOT201906_vnet/subnets/MFIOT201906_subnet] azurerm_network_interface.iaasnic: Creating... azurerm_network_interface.iaasnic: Creation complete after 0s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/networkInterfaces/iaas-nic] azurerm_virtual_machine.main: Creating... azurerm_virtual_machine.main: Still creating... [10s elapsed] azurerm_virtual_machine.main: Still creating... [20s elapsed] azurerm_virtual_machine.main: Still creating... [30s elapsed] azurerm_virtual_machine.main: Provisioning with 'remote-exec'... azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH... azurerm_virtual_machine.main (remote-exec): Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com azurerm_virtual_machine.main (remote-exec): User: tf_admin azurerm_virtual_machine.main (remote-exec): Password: true azurerm_virtual_machine.main (remote-exec): Private key: false azurerm_virtual_machine.main (remote-exec): Certificate: false azurerm_virtual_machine.main (remote-exec): SSH Agent: true azurerm_virtual_machine.main (remote-exec): Checking Host Key: false azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH... azurerm_virtual_machine.main (remote-exec): Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com azurerm_virtual_machine.main (remote-exec): User: tf_admin azurerm_virtual_machine.main (remote-exec): Password: true azurerm_virtual_machine.main (remote-exec): Private key: false azurerm_virtual_machine.main (remote-exec): Certificate: false azurerm_virtual_machine.main (remote-exec): SSH Agent: true azurerm_virtual_machine.main (remote-exec): Checking Host Key: false azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH... azurerm_virtual_machine.main (remote-exec): Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com azurerm_virtual_machine.main (remote-exec): User: tf_admin azurerm_virtual_machine.main (remote-exec): Password: true azurerm_virtual_machine.main (remote-exec): Private key: false azurerm_virtual_machine.main (remote-exec): Certificate: false azurerm_virtual_machine.main (remote-exec): SSH Agent: true azurerm_virtual_machine.main (remote-exec): Checking Host Key: false azurerm_virtual_machine.main: Still creating... [40s elapsed] azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH... azurerm_virtual_machine.main (remote-exec): Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com azurerm_virtual_machine.main (remote-exec): User: tf_admin azurerm_virtual_machine.main (remote-exec): Password: true azurerm_virtual_machine.main (remote-exec): Private key: false azurerm_virtual_machine.main (remote-exec): Certificate: false azurerm_virtual_machine.main (remote-exec): SSH Agent: true azurerm_virtual_machine.main (remote-exec): Checking Host Key: false azurerm_virtual_machine.main (remote-exec): Connected! azurerm_virtual_machine.main: Provisioning with 'file'... azurerm_virtual_machine.main: Provisioning with 'remote-exec'... azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH... azurerm_virtual_machine.main (remote-exec): Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com azurerm_virtual_machine.main (remote-exec): User: tf_admin azurerm_virtual_machine.main (remote-exec): Password: true azurerm_virtual_machine.main (remote-exec): Private key: false azurerm_virtual_machine.main (remote-exec): Certificate: false azurerm_virtual_machine.main (remote-exec): SSH Agent: true azurerm_virtual_machine.main (remote-exec): Checking Host Key: false azurerm_virtual_machine.main (remote-exec): Connected! azurerm_virtual_machine.main: Still creating... [50s elapsed] azurerm_virtual_machine.main (remote-exec): WARNING: apt does not have a stable CLI interface. Use with caution in scripts. azurerm_virtual_machine.main (remote-exec): WARNING: apt does not have a stable CLI interface. Use with caution in scripts. azurerm_virtual_machine.main: Still creating... [1m0s elapsed] azurerm_virtual_machine.main (remote-exec): Extracting templates from packages: 47% azurerm_virtual_machine.main (remote-exec): Extracting templates from packages: 95% azurerm_virtual_machine.main (remote-exec): Extracting templates from packages: 100% azurerm_virtual_machine.main: Still creating... [1m10s elapsed] azurerm_virtual_machine.main: Still creating... [1m20s elapsed] azurerm_virtual_machine.main: Still creating... [1m30s elapsed] azurerm_virtual_machine.main: Still creating... [1m40s elapsed] azurerm_virtual_machine.main: Still creating... [1m50s elapsed] azurerm_virtual_machine.main: Still creating... [2m0s elapsed] azurerm_virtual_machine.main: Still creating... [2m10s elapsed] azurerm_virtual_machine.main: Still creating... [2m20s elapsed] azurerm_virtual_machine.main: Still creating... [2m30s elapsed] azurerm_virtual_machine.main: Still creating... [2m40s elapsed] azurerm_virtual_machine.main (remote-exec): [WARNING]: No inventory was parsed, only implicit localhost is available azurerm_virtual_machine.main (remote-exec): azurerm_virtual_machine.main (remote-exec): [WARNING]: provided hosts list is empty, only localhost is available. Note azurerm_virtual_machine.main (remote-exec): that the implicit localhost does not match 'all' azurerm_virtual_machine.main (remote-exec): azurerm_virtual_machine.main (remote-exec): PLAY [localhost] *************************************************************** azurerm_virtual_machine.main (remote-exec): TASK [debug] ******************************************************************* azurerm_virtual_machine.main (remote-exec): ok: [localhost] => { azurerm_virtual_machine.main (remote-exec): "msg": "Hello world!" azurerm_virtual_machine.main (remote-exec): } azurerm_virtual_machine.main (remote-exec): PLAY RECAP ********************************************************************* azurerm_virtual_machine.main (remote-exec): localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0 azurerm_virtual_machine.main: Creation complete after 2m50s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Compute/virtualMachines/iaas-vm] Apply complete! Resources: 8 added, 0 changed, 0 destroyed. The state of your infrastructure has been saved to the path below. This state is required to modify and destroy your infrastructure, so keep it safe. To inspect the complete state use the `terraform show` command. State path: terraform.tfstate Outputs: host = myfirstiaasonterraform.uksouth.cloudapp.azure.com

Once you’re done with your environment, use terraform destroy to shut it all down… and enjoy :)

The full source is available in the associated Gist. Pull requests and constructive criticism are very welcome!

Featured image is “Seca” by “Olearys” on Flickr and is released under a CC-BY license.

Building a Gitlab and Ansible Tower (AWX) Demo in Vagrant with Ansible

2019-05-162019-05-16 JonTheNiceGuy Leave a comment

TL;DR – I created a repository on GitHub ‌ containing a Vagrantfile and an Ansible Playbook to build a VM running Docker. That VM hosts AWX (Ansible Tower’s upstream open-source project) and Gitlab.

A couple of years ago, a colleague created (and I enhanced) a Vagrant and Ansible playbook called “Project X” which would run an AWX instance in a Virtual Machine. It’s a bit heavy, and did a lot of things to do with persistence that I really didn’t need, so I parked my changes and kept an eye on his playbook…

Fast-forward to a week-or-so ago. I needed to explain what a Git/Ansible Workflow would look like, and so I went back to look at ProjectX. Oh my, it looks very complex and consumed a lot of roles that, historically, I’ve not been that impressed with… I just needed the basics to run AWX. Oh, and I also needed a Gitlab environment.

I knew that Gitlab had a docker-based install, and so does AWX, so I trundled off to find some install guides. These are listed in the playbook I eventually created (hence not listing them here). Not all the choices I made were inspired by those guides – I wanted to make quite a bit of this stuff “build itself”… this meant I wanted users, groups and projects to be created in Gitlab, and users, projects, organisations, inventories and credentials to be created in AWX.

I knew that you can create Docker Containers in Ansible, so after I’d got my pre-requisites built (full upgrade, docker installed, pip libraries installed), I add the gitlab-ce :latest docker image, and expose some ports. Even now, I’m not getting the SSH port mapped that I was expecting, but … it’s no disaster.

I did notice that the Gitlab service takes ages to start once the container is marked as running, so I did some more digging, and found that the uri module can be used to poll a URL. It wasn’t documented well how you can make it keep polling until you get the response you want, so … I added a PR on the Ansible project’s github repo for that one (and I also wrote a blog post about that earlier too).

Once I had a working Gitlab service, I needed to customize it. There are a bunch of Gitlab modules in Ansible but since a few releases back of Gitlab, these don’t work any more, so I had to find a different way. That different way was to run an internal command called “gitlab-rails”. It’s not perfect (so it doesn’t create repos in your projects) but it’s pretty good at giving you just enough to build your demo environment. So that’s getting Gitlab up…

Now I need to build AWX. There’s lots of build guides for this, but actually I had most luck using the README in their repository (I know, who’d have thought it!??!) There are some “Secrets” that should be changed in production that I’m changing in my script, but on the whole, it’s pretty much a vanilla install.

Unlike the Gitlab modules, the Ansible Tower modules all work, so I use these to create the users, credentials and so-on. Like the gitlab-rails commands, however, the documentation for using the tower modules is pretty ropey, and I still don’t have things like “getting your users to have access to your organisation” working from the get-go, but for the bulk of the administration, it does “just work”.

Like all my playbooks, I use group_vars to define the stuff I don’t want to keep repeating. In this demo, I’ve set all the passwords to “Passw0rd”, and I’ve created 3 users in both AWX and Gitlab – csa, ops and release – indicative of the sorts of people this demo I ran was aimed at – Architects, Operations and Release Managers.

Maybe, one day, I’ll even be able to release the presentation that went with the demo ;)

On a more productive note, if you’re doing things with the tower_ modules and want to tell me what I need to fix up, or if you’re doing awesome things with the gitlab-rails tool, please visit the repo with this automation code in, and take a look at some of my “todo” items! Thanks!!

Featured image is “Tower” by “Yijun Chen” on Flickr and is released under a CC-BY-SA license.

"www.GetIPv6.info decal" from Phil Wolff on Flickr

Hurricane Electric IPv6 Gateway on Raspbian for Raspberry Pi

2019-01-212019-03-12 JonTheNiceGuy Leave a comment

NOTE: This article was replaced on 2019-03-12 by a github repository where I now use Vagrant instead of a Raspberry Pi, because I was having some power issues with my Raspberry Pi. Also, using this method means I can easily use an Ansible Playbook. The following config will still work(!) however I prefer this Vagrant/Ansible workflow for this, so won’t update this blog post any further.

Following an off-hand remark from a colleague at work, I decided I wanted to set up a Raspberry Pi as a Hurricane Electric IPv6 6in4 tunnel router. Most of the advice around (in particular, this post about setting up IPv6 on the Raspberry Pi Forums) related to earlier version of Raspbian, so I thought I’d bring it up-to-date.

I installed the latest available version of Raspbian Stretch Lite (2018-11-13) and transferred it to a MicroSD card. I added the file ssh to the boot volume and unmounted it. I then fitted it into my Raspberry Pi, and booted it. While it was booting, I set a static IPv4 address on my router (192.168.1.252) for the Raspberry Pi, so I knew what IP address it would be on my network.

I logged into my Hurricane Electric (HE) account at tunnelbroker.net and created a new tunnel, specifying my public IP address, and selecting my closest HE endpoint. When the new tunnel was created, I went to the “Example Configurations” tab, and selected “Debian/Ubuntu” from the list of available OS options. I copied this configuration into my clipboard.

I SSH’d into the Pi, and gave it a basic config (changed the password, expanded the disk, turned off “predictable network names”, etc) and then rebooted it.

After this was done, I created a file in /etc/network/interfaces.d/he-ipv6 and pasted in the config from the HE website. I had to change the “local” line from the public IP I’d provided HE with, to the real IP address of this box. Note that any public IPs (that is, not 192.168.x.x addresses) in the config files and settings I’ve noted refer to documentation addressing (TEST-NET-2 and the IPv6 documentation address ranges)

auto he-ipv6
iface he-ipv6 inet6 v4tunnel
        address 2001:db8:123c:abd::2
        netmask 64
        endpoint 198.51.100.100
        local 192.168.1.252
        ttl 255
        gateway 2001:db8:123c:abd::1

Next, I created a file in /etc/network/interfaces.d/eth0 and put the following configuration in, using the first IPv6 address in the “routed /64” range listed on the HE site:

auto eth0
iface eth0 inet static
    address 192.168.1.252
    gateway 192.168.1.254
    netmask 24
    dns-nameserver 8.8.8.8
    dns-nameserver 8.8.4.4

iface eth0 inet6 static
    address 2001:db8:123d:abc::1
    netmask 64

Next, I disabled the DHCPd service by issuing systemctl stop dhcpcd.service Late edit (2019-01-22): Note, a colleague mentioned that this should have actually been systemctl stop dhcpcd.service && systemctl disable dhcpcd.service – good spot! Thanks!! This ensures that if, for some crazy reason, the router stops offering the right DHCP address to me, I can still access this box on this IP. Huzzah!

I accessed another host which had IPv6 access, and performed both a ping and an SSH attempt. Both worked. Fab. However, this now needs to be blocked, as we shouldn’t permit anything to be visible downstream from this gateway.

I’m using the Uncomplicated Firewall (ufw) which is a simple wrapper around IPTables. Let’s create our policy.

# First install the software
sudo apt update && sudo apt install ufw -y

# Permits inbound IPv4 SSH to this host - which should be internal only. 
# These rules allow tailored access in to our managed services
ufw allow in on eth0 app DNS
ufw allow in on eth0 app OpenSSH

# These rules accept all broadcast and multicast traffic
ufw allow in on eth0 to 224.0.0.0/4 # Multicast addresses
ufw allow in on eth0 to 255.255.255.255 # Global broadcast
ufw allow in on eth0 to 192.168.1.255 # Local broadcast

# Alternatively, accept everything coming in on eth0
# If you do this one, you don't need the lines above
ufw allow in on eth0

# Setup the default rules - deny inbound and routed, permit outbound
ufw default deny incoming 
ufw default deny routed
ufw default allow outgoing

# Prevent inbound IPv6 to the network
# Also, log any drops so we can spot them if we have an issue
ufw route deny log from ::/0 to 2001:db8:123d:abc::/64

# Permit outbound IPv6 from the network
ufw route allow from 2001:db8:123d:abc::/64

# Start the firewall!
ufw enable

# Check the policy
ufw status verbose
ufw status numbered

Most of the documentation I found suggested running radvd for IPv6 address allocation. This basically just allocates on a random basis, and, as far as I can make out, each renewal gives the host a new IPv6 address. To make that work, I performed apt-get update && apt-get install radvd -y and then created this file as /etc/radvd.conf. If all you want is a floating IP address with no static assignment – this will do it…

interface eth0
{
    AdvSendAdvert on;
    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 10;
    prefix 2001:db8:123d:abc::/64
    {
        AdvOnLink on;
        AdvAutonomous on;
    };
   route ::/0 {
   };
};

However, this doesn’t give me the ability to statically assign IPv6 addresses to hosts. I found that a different IPv6 allocation method will do static addressing, based on your MAC address called SLAAC (note there are some privacy issues with this, but I’m OK with them for now…) In this mode assuming the prefix as before – 2001:db8:123d:abc:: and a MAC address of de:ad:be:ef:01:23, your IPv6 address will be something like: 2001:db8:123d:abc:dead:beff:feef:0123and this will be repeatably so – because you’re unlikely to change your MAC address (hopefully!!).

This SLAAC allocation mode is available in DNSMasq, which I’ve consumed before (in a Pi-Hole). To use this, I installed DNSMasq with apt-get update && apt-get install dnsmasq -y and then configured it as follows:

interface=eth0
listen-address=127.0.0.1
# DHCPv6 - Hurricane Electric Resolver and Google's
dhcp-option=option6:dns-server,[2001:470:20::2],[2001:4860:4860::8888]
# IPv6 DHCP scope
dhcp-range=2001:db8:123d:abc::, slaac

I decided to move from using my router as a DHCP server, to using this same host, so expanded that config as follows, based on several posts, but mostly centred around the MAN page (I’m happy to have this DNSMasq config improved if you’ve got any suggestions ;) )

# Stuff for DNS resolution
domain-needed
bogus-priv
no-resolv
filterwin2k
expand-hosts
domain=localnet
local=/localnet/
log-queries

# Global options
interface=eth0
listen-address=127.0.0.1

# Set these hosts as the DNS server for your network
# Hurricane Electric and Google
dhcp-option=option6:dns-server,[2001:470:20::2],2001:4860:4860::8888]

# My DNS servers are:
server=1.1.1.1                # Cloudflare's DNS server
server=8.8.8.8                # Google's DNS server

# IPv4 DHCP scope
dhcp-range=192.168.1.10,192.168.1.210,12h
# IPv6 DHCP scope
dhcp-range=2001:db8:123d:abc::, slaac

# Record the DHCP leases here
dhcp-leasefile=/run/dnsmasq/dhcp-lease

# DHCPv4 Router
dhcp-option=3,192.168.1.254

So, that’s what I’m doing now! Hope it helps you!

Late edit (2019-01-22): In issue 129 of the “Awesome Self Hosted Newsletter“, I found a post called “My New Years Resolution: Learn IPv6“… which uses a pfSense box and a Hurricane Electric tunnel too. Fab!

Header image is “www.GetIPv6.info decal” by “Phil Wolff” on Flickr and is released under a CC-BY-SA license. Used with thanks!

"Zenith Z-19 Terminal" from ajmexico on Flickr

Some things I learned this week while coding extensions to Ansible!

2019-01-182019-01-21 JonTheNiceGuy Leave a comment

If you follow any of the content I post around the internet, you might have seen me asking questions about trying to get data out of azure_rm_*_facts into something that’s usable. I can’t go into why I needed that data yet (it’s a little project I’m working on), but the upshot is that trying to manipulate data using “set_fact” with jinja is *doable* but *messy*. In the end, I decided to hand it all off to a new ansible module I’m writing. So, here are the things I learned about this.

There’s lots more documentation about writing a module (a plugin that let’s you do stuff) than there is about writing filters (things that change text inline) or lookups (things that let you search other data stores). In the end, while I could have spent the time to figure out how better to write a filter or a lookup, it actually makes more sense in my context to hand a module all my data, and say “Parse this” and register the result than it would have done to have the playbook constantly check whether things were in other things. I still might have to do that, but… you know, for now, I’ve got the bits I want! :)
I did start looking at writing a filter, and discovered that the “debugging advice” on the ansible site is all geared up to enable more modules than enabling filters… but I did discover that modules execute on their target (e.g. WebHost01) while filters and lookups execute on the local machine. Why does this matter? Well…..
While I was looking for documentation about debugging Ansible code, I stumbled over this page on debugging modules that makes it all look easy. Except, it’s only for debugging *MODULES* (very frustrating. Well, what does it actually mean? The modules get zipped up and sent to the host that will be executing the code, which means that with an extra flag to your playbook (ANSIBLE_KEEP_REMOTE_FILES – even if it’s going to be run on “localhost”), you get the combined output of the script placed into a path on your machine, which means you can debug that specific play. That doesn’t work for filters…
SOO, I jumped into #ansible on Freenode and asked for help. They in turn couldn’t help me (it’s more about writing playbooks than writing filters, modules, etc), so they directed me to #ansible-devel, where I was advised to use a python library called “q” (Edit, same day: my friend @mohclips pointed me to this youtube video from 2003 of the guy who wrote q explaining about it. Thanks Nick! I learned something *else* about this library).
Oh man, this is the motherlode. So, q makes life *VERY* easy. Assuming this is valid code: All you’d need to do would be to add two lines, as you’ll see here: This then dumps the output from each of the q(something) lines into /tmp/q for you to read at your leisure! (To be fair, I’d probably remove it after you’ve finished, so you don’t fill a disk :) )
And that’s when I discovered that it’s actually easier to use q() for all my python debugging purposes than it is to follow the advice above about debugging modules. Yehr, it’s basically a load of print statements, so you don’t get to see stack traces, or read all the variables, and you don’t get to step through code to see why decisions were taken… but for the rubbish code I produce, it’s easily enough for me!

Header image is “Zenith Z-19 Terminal” by “ajmexico” on Flickr and is released under a CC-BY license. Used with thanks!

"LEGO Factory Playset" from Brickset on Flickr

Building Azure Environments in Ansible

2018-12-212018-12-21 JonTheNiceGuy Leave a comment

Recently, I’ve been migrating my POV (proof of value) and POC (proof of concept) environment from K5 to Azure to be able to test vendor products inside Azure. I ran a few tests to build the environment using the native tools (the powershell scripts) and found that the Powershell way of delivering Azure environments seems overly complicated… particularly as I’m comfortable with how Ansible works.

To be fair, I also need to look at Terraform, but that isn’t what I’m looking at today :)

So, let’s start with the scaffolding. Any Ansible Playbook which deals with creating virtual machines needs to have some extra modules installed. Make sure you’ve got ansible 2.7 or later and the python azure library 2.0.0 or later (you can get both with pip for python).

Next, let’s look at the group_vars for this playbook.

--- project_prefix: env01 project_breakglass: "My$uper$ecretPassw0rd" project_location: uksouth image: ubuntu1804: offer: UbuntuServer publisher: Canonical sku: '18.04-LTS' version: latest provision_script: userdata/ubuntu.j2 fortigate603_payg: offer: fortinet_fortigate-vm_v5 publisher: fortinet sku: 'fortinet_fg-vm_payg' version: 6.0.3 plan: yes provision_script: userdata/fortigate.j2 network: wan: name: wan prefix: 10.10.10 protected: name: protected prefix: 10.10.20 gateway: 10.10.20.4 appliances: fw01: name: fw01 image: "{{ image.fortigate603_payg }}" size: Standard_F1 priority: 1 ports: - subnet: "{{ network.wan }}" - subnet: "{{ network.protected }}" private_ip: "{{ network.wan.prefix }}.4" public: false default_allow_policy: - source: port2 destination: port1 nat: yes vm01: name: vm01 image: "{{ image.ubuntu1804 }}" size: Standard_B1s priority: 2 ports: - subnet: "{{ network.protected }}" private_ip: "{{ network.protected.prefix }}.5" public: false vm02: name: vm02 image: "{{ image.ubuntu1804 }}" size: Standard_B1s priority: 0 ports: - subnet: "{{ network.wan }}" private_ip: "{{ network.wan.prefix }}.250" public: false

This file has several pieces. We define the project settings (anything prefixed project_ is a project setting), including the prefix used for all resources we create (in this case “env01“), and a standard password used for all VMs we create (in this case “My$uper$ecret$Passw0rd“).

Next we define the standard images to load from the Marketplace. You can extend this with other images, these are just the “easiest” ones that I’m most familiar with (your mileage may vary). Next up is the networks to build inside the VNet, and lastly we define the actual machines we want to build. If you’ve got questions about any of the values we define here, just let me know in the comments below :)

Next, we’ll start looking at the playbook (this has been exploded out – the full playbook is also in the gist).

vars: location: "{{ project_location|default('uksouth') }}" prefix: | {%- if lookup('env', 'ANSIBLE_PREFIX') != '' -%} {{ lookup('env', 'ANSIBLE_PREFIX') }} {%- else -%} {{ project_prefix|default(ansible_date_time.epoch|hash('md5')|truncate(8, true, '')) }} {%- endif %}" breakglass_pw: | {%- if lookup('env', 'BREAKGLASS') != '' -%} {{ lookup('env', 'BREAKGLASS') }} {%- else -%} {{ project_breakglass|default(ansible_date_time.epoch|hash('md5')|truncate(8, true, '') + '_Azure') }} {%- endif -%} tasks: - debug: var=prefix - debug: var=breakglass_pw

Here we start by pulling in the variables we might want to override, and we do this by reading system environment variables (ANSIBLE_PREFIX and BREAKGLASS) and using them if they’re set. If they’re not, use the project defaults, and if that hasn’t been set, use some pre-defined values… and then tell us what they are when we’re running the tasks (those are the debug: lines).

- name: "Create Resource Group" azure_rm_resourcegroup: name: "{{ prefix }}" location: "{{ location }}" # Yes, this is not good practice, but it's just for a POV/POC - name: "Create Any-Any Allow Security Group" azure_rm_securitygroup: resource_group: "{{ prefix }}" name: "{{ prefix }}AnyAnyAllow" rules: - name: AllowAllInbound protocol: "*" destination_port_range: "*" access: Allow priority: 100 direction: Inbound - name: AllowAllOutbound protocol: "*" destination_port_range: "*" access: Allow priority: 100 direction: Outbound - name: "Create Storage Account" azure_rm_storageaccount: resource_group: "{{ prefix }}" name: "{{ prefix }}storage" account_type: Standard_LRS - name: "Create VNET" azure_rm_virtualnetwork: resource_group: "{{ prefix }}" name: "{{ prefix }}network01" address_prefixes_cidr: "10.0.0.0/8"

This block is where we create our “Static Assets” – individual items that we will be consuming later. This shows a clear win here over the Powershell methods endorsed by Microsoft – here you can create a Resource Group (RG) as part of the playbook! We also create a single Storage Account for this RG and a single VNET too.

These creation rules are not suitable for production use, as this defines an “Any-Any” Security group! You should tailor your security groups for your need, not for blanket access in!

- name: "Schedule UDR Creation" azure_rm_routetable: resource_group: "{{ prefix }}" name: "{{ prefix }}{{ item.value.name }}_udr" with_dict: "{{ network }}" loop_control: label: "{{ prefix }}{{ item.value.name }}" when: item.value.gateway is defined async: 1000 poll: 0 register: sleeper - name: "Check UDRs Created" async_status: jid: "{{ item.ansible_job_id }}" register: sleeper_status until: sleeper_status.finished retries: 500 delay: 4 with_items: "{{ sleeper.results }}" when: item.ansible_job_id is defined loop_control: label: "{{ item._ansible_item_label }}"

This is where things start to get a bit more interesting – We’re using the “async/async_status” pattern here (and the rest of these sections) to start creating the resources in parallel. As far as I can tell, sometimes you’ll get a case where the async doesn’t quite get set up fast enough, then the async_status can’t track the resources properly, but re-running the playbook should be enough to sort that out, without slowing things down too much.

But what are we actually doing with this block of code? A UDR is a “User Defined Route” or routing table for Azure. Effectively, you treat each network interface as being plumbed directly to the router (none of this “same subnet broadcast” stuff works here!) so you can do routing at the router for all the networks.

By default there are some existing network routes (stuff to the internet flows to the internet, RFC1918 addresses are dropped with the exception of any RFC1918 addresses you have covered in your VNETs, and each of your subnets can reach each other “directly”). Adding a UDR overrides this routing table. The UDRs we’re creating here are applied at a subnet level, but currently don’t override any of the existing routes (they’re blank). We’ll start putting routes in after we’ve added the UDRs to the subnets. Talking of which….

- name: "Schedule Subnet Creation" azure_rm_subnet: resource_group: "{{ prefix }}" name: "{{ prefix }}{{ item.value.name }}" address_prefix: "{{ item.value.prefix }}.0/24" virtual_network: "{{ prefix }}network01" route_table: | {%- if item.value.gateway is defined -%} {{ prefix }}{{ item.value.name }}_udr {%- else -%} {{ omit }}{%- endif -%} with_dict: "{{ network }}" loop_control: label: "{{ prefix }}{{ item.value.name }}" async: 1000 poll: 0 register: sleeper - name: "Check Subnet Created" async_status: jid: "{{ item.ansible_job_id }}" register: sleeper_status until: sleeper_status.finished retries: 500 delay: 4 with_items: "{{ sleeper.results }}" when: item.ansible_job_id is defined loop_control: label: "{{ item._ansible_item_label }}"

Again, this block is not really suitable for production use, and assumes the VNET supernet of /8 will be broken down into several /24’s. In the “real world” you might deliver a handful of /26’s in a /24 VNET… or you might even have lots of disparate /24’s in the VNET which are then allocated exactly as individual /24 subnets… this is not what this model delivers but you might wish to investigate further!

- name: "Schedule Route Creation" azure_rm_route: resource_group: "{{ prefix }}" name: "{{ prefix }}{{ item.value.name }}" address_prefix: "0.0.0.0/0" route_table_name: "{{ prefix }}{{ item.value.name }}_udr" next_hop_type: virtual_appliance next_hop_ip_address: "{{ item.value.gateway }}" with_dict: "{{ network }}" loop_control: label: "{{ prefix }}{{ item.value.name }}" when: item.value.gateway is defined async: 1000 poll: 0 register: sleeper - name: "Check Routes Created" async_status: jid: "{{ item.ansible_job_id }}" register: sleeper_status until: sleeper_status.finished retries: 500 delay: 4 with_items: "{{ sleeper.results }}" when: item.ansible_job_id is defined loop_control: label: "{{ item._ansible_item_label }}"

Now that we’ve created our subnets, we can start adding the routing table to the UDR. This is a basic one – add a 0.0.0.0/0 route (internet access) from the “protected” network via the firewall. You can get a lot more specific than this – most people are likely to want to add the VNET range (in this case 10.0.0.0/8) via the firewall as well, except for this subnet (because otherwise, for example, 10.0.0.100 trying to reach 10.0.0.101 will go via the firewall too).

Without going too much into the intricacies of network architecture, if you are routing your traffic between subnets to the firewall, it’s probably better to get an appliance with more interfaces, so you can route traffic across the appliance, rather than going across a single interface as this will halve your traffic bandwidth (it’s currently capped 1Gb/s – so 500Mb/s).

Having mentioned “The Internet” – let’s give our firewall a public IP address, and create the rest of the interfaces as well.

- name: "Schedule Public IP creation" azure_rm_publicipaddress: resource_group: "{{ prefix }}" allocation_method: Static name: "{{ prefix }}{{ item.0.name }}port{{ item.1.subnet.name }}pubip" with_subelements: - "{{ appliances }}" - ports when: item.1.public is not defined or (item.1.public is defined and item.1.public == 'true') loop_control: label: "{{ prefix }}{{ item.0.name }}port{{ item.1.subnet.name }}pubip" async: 1000 poll: 0 register: sleeper - name: "Check Public IPs Created" async_status: jid: "{{ item.ansible_job_id }}" register: sleeper_status until: sleeper_status.finished retries: 500 delay: 4 with_items: "{{ sleeper.results }}" when: item.ansible_job_id is defined loop_control: label: "{{ item._ansible_item_label }}" - name: "Schedule Network Interface Creation" azure_rm_networkinterface: resource_group: "{{ prefix }}" name: "{{ prefix }}{{ item.0.name }}port{{ item.1.subnet.name }}" virtual_network: "{{ prefix }}network01" subnet: "{{ prefix }}{{ item.1.subnet.name }}" public_ip_name: "{{ prefix }}pubip01" security_group: "{{ prefix }}AnyAnyAllow" ip_configurations: - name: "{{ prefix }}{{ item.0.name }}port{{ item.1.subnet.name }}" public_ip_address_name: | {%- if item.1.public is not defined or (item.1.public is defined and item.1.public == 'true') -%} {{ prefix }}{{ item.0.name }}port{{ item.1.subnet.name }}pubip {%- else -%}{%- endif -%}" primary: yes with_subelements: - "{{ appliances }}" - ports loop_control: label: "{{ prefix }}{{ item.0.name }}port{{ item.1.subnet.name }}" async: 1000 poll: 0 register: sleeper - name: "Check Network Interfaces Created" async_status: jid: "{{ item.ansible_job_id }}" register: sleeper_status until: sleeper_status.finished retries: 500 delay: 4 with_items: "{{ sleeper.results }}" when: item.ansible_job_id is defined loop_control: label: "{{ item._ansible_item_label }}"

This script creates a public IP address by default for each interface unless you explicitly tell it not to (see lines 40, 53 and 62 in the group_vars file I rendered above). You could easily turn this around by changing the lines which contain this:

item.1.public is not defined or (item.1.public is defined and item.1.public == 'true')

into lines which contain this:

item.1.public is defined and item.1.public == 'true'

OK, having done all that, we’re now ready to build our virtual machines. I’ve introduced a “Priority system” here – VMs with priority 0 go first, then 1, and 2 go last. The code snippet below is just for priority 0, but you can easily see how you’d extrapolate that out (and in fact, the full code sample does just that).

- name: "Schedule Virtual Machine Creation - Priority 0 (or 'All')" azure_rm_virtualmachine: resource_group: "{{ prefix }}" name: "{{ prefix }}{{ item.value.name }}" vm_size: "{{ item.value.size }}" storage_account: "{{ prefix }}storage" storage_container: "{{ prefix }}{{ item.value.name }}os" storage_blob: "{{ prefix }}{{ item.value.name }}os.vhd" admin_username: breakglass admin_password: "{{ breakglass_pw }}" network_interfaces: | [ {%- for nw in item.value.ports -%} '{{ prefix }}{{ item.value.name }}port{{ nw.subnet.name }}' {%- if not loop.last -%}, {%- endif -%} {%- endfor -%} ] custom_data: | {%- if item.value.provision_script is defined and item.value.provision_script != '' -%} {%- include(item.value.provision_script) -%} {%- elif item.value.image.provision_script is defined and item.value.image.provision_script != '' -%} {%- include(item.value.image.provision_script) -%} {%- else -%} {{ omit }} {%- endif -%} image: publisher: "{{ item.value.image.publisher }}" offer: "{{ item.value.image.offer }}" sku: "{{ item.value.image.sku }}" version: "{{ item.value.image.version }}" plan: | {%- if item.value.image.plan is not defined -%}{{ omit }}{%- else -%} {'name': '{{ item.value.image.sku }}', 'publisher': '{{ item.value.image.publisher }}', 'product': '{{ item.value.image.offer }}' } {%- endif -%} with_dict: "{{ appliances }}" when: item.value.priority | default(0) == 0 loop_control: label: "{{ prefix }}{{ item.value.name }}" async: 1000 poll: 0 register: sleeper - name: "Check Virtual Machines Created - Priority 0" async_status: jid: "{{ item.ansible_job_id }}" register: sleeper_status until: sleeper_status.finished retries: 500 delay: 4 with_items: "{{ sleeper.results }}" when: item.ansible_job_id is defined loop_control: label: "{{ item._ansible_item_label }}"

There are a few blocks here to draw attention to :) I’ve re-jigged them a bit here so it’s clearer to understand, but when you see them in the main playbook they’re a bit more compact. Let’s start with looking at the Network Interfaces section!

network_interfaces: |
  [
    {%- for nw in item.value.ports -%}
      '{{ prefix }}{{ item.value.name }}port{{ nw.subnet.name }}'
      {%- if not loop.last -%}, {%- endif -%} 
    {%- endfor -%}
  ]

In this part, we loop over the ports defined for the virtual machine. This is because one device may have 1 interface, or four interfaces. YAML is parsed to make a JSON variable, so here we can create a JSON variable, that when the YAML is parsed it will just drop in. We’ve previously created all the interfaces to have names like this PREFIXhostnamePORTsubnetname (or aFW01portWAN in more conventional terms), so here we construct a JSON array, like this: ['aFW01portWAN'] but that could just as easily have been ['aFW01portWAN', 'aFW01portProtect', 'aFW01portMGMT', 'aFW01portSync']. This will then attach those interfaces to the virtual machine.

Next up, custom_data. This section is sometimes known externally as userdata or config_disk. My code has always referred to it as a “Provision Script” – hence the variable name in the code below!

custom_data: |
  {%- if item.value.provision_script is defined and item.value.provision_script != '' -%}
    {%- include(item.value.provision_script) -%}
  {%- elif item.value.image.provision_script is defined and item.value.image.provision_script != '' -%}
    {%- include(item.value.image.provision_script) -%}
  {%- else -%}
    {{ omit }}
  {%- endif -%}

Let’s pick this one apart too. If we’ve defined a provisioning script file for the VM, include it, if we’ve defined a provisioning script file for the image (or marketplace entry), then include that instead… otherwise, pretend that there’s no “custom_data” field before you submit this to Azure.

One last quirk to Azure, is that some images require a “plan” to go with it, and others don’t.

plan: |
  {%- if item.value.image.plan is not defined -%}{{ omit }}{%- else -%}
    {'name': '{{ item.value.image.sku }}',
     'publisher': '{{ item.value.image.publisher }}',
     'product': '{{ item.value.image.offer }}'
    }
  {%- endif -%}

So, here we say “if we’ve not got a plan, omit the value being passed to Azure, otherwise use these fields we previously specified. Weird huh?

The very last thing we do in the script is to re-render the standard password we’ve used for all these builds, so that we can check them out!

Want to review this all in one place?

Here’s the link to the full playbook, as well as the group variables (which should be in ./group_vars/all.yml) and two sample userdata files (which should be in ./userdata) for an Ubuntu machine (using cloud-init) and one for a FortiGate Firewall.

All the other files in that gist (prefixes from 10-16 and 00) are for this blog post only, and aren’t likely to work!

If you do end up using this, please drop me a note below, or star the gist! That’d be awesome!!

Image credit: “Lego Factory Playset” from Flickr by “Brickset” released under a CC-BY license. Used with Thanks!

Video Summary – Matrix Bridging to IRC, Slack and Telegram

2018-05-232018-05-23 JonTheNiceGuy Leave a comment

I made a video about bridging chat networks (IRC, Slack and Telegram) using another chat network – Matrix.

I hope you find it useful!

Defining Networks with Ansible

2018-03-212018-03-21 JonTheNiceGuy Leave a comment

In my day job, I’m using Ansible to provision networks in OpenStack. One of the complaints I’ve had about the way I now define them is that the person implementing the network has to spell out all the network elements – the subnet size, DHCP pool, the addresses of the firewalls and names of those items. This works for a manual implementation process, but is seriously broken when you try to hand that over to someone else to implement. Most people just want something which says “Here is the network I want to implement – 192.0.2.0/24″… and let the system make it for you.

So, I wrote some code to make that happen. It’s not perfect, and it’s not what’s in production (we have lots more things I need to add for that!) but it should do OK with an IPv4 network.

Hope this makes sense!

--- - hosts: localhost vars: - networks: # Defined as a subnet with specific router and firewall addressing external: subnet: "192.0.2.0/24" firewall: "192.0.2.1" router: "192.0.2.254" # Defined as an IP address and CIDR prefix, rather than a proper network address and CIDR prefix internal_1: subnet: "198.51.100.64/24" # A valid smaller network and CIDR prefix internal_2: subnet: "203.0.113.0/27" # A tiny CIDR network internal_3: subnet: "203.0.113.64/30" # These two CIDR networks are unusable for this environment internal_4: subnet: "203.0.113.128/31" internal_5: subnet: "203.0.113.192/32" # A massive CIDR network internal_6: subnet: "10.0.0.0/8" tasks: # Based on https://stackoverflow.com/a/47631963/5738 with serious help from mgedmin and apollo13 via #ansible on Freenode - name: Add router and firewall addressing for CIDR prefixes < 30 set_fact: networks: > {{ networks | default({}) | combine( {item.key: { 'subnet': item.value.subnet | ipv4('network'), 'router': item.value.router | default((( item.value.subnet | ipv4('network') | ipv4('int') ) + 1) | ipv4), 'firewall': item.value.firewall | default((( item.value.subnet | ipv4('broadcast') | ipv4('int') ) - 1) | ipv4), 'dhcp_start': item.value.dhcp_start | default((( item.value.subnet | ipv4('network') | ipv4('int') ) + 2) | ipv4), 'dhcp_end': item.value.dhcp_end | default((( item.value.subnet | ipv4('broadcast') | ipv4('int') ) - 2) | ipv4) } }) }} with_dict: "{{ networks }}" when: item.value.subnet | ipv4('prefix') < 30 - name: Add router and firewall addressing for CIDR prefixes = 30 set_fact: networks: > {{ networks | default({}) | combine( {item.key: { 'subnet': item.value.subnet | ipv4('network'), 'router': item.value.router | default((( item.value.subnet | ipv4('network') | ipv4('int') ) + 1) | ipv4), 'firewall': item.value.firewall | default((( item.value.subnet | ipv4('broadcast') | ipv4('int') ) - 1) | ipv4) } }) }} with_dict: "{{ networks }}" when: item.value.subnet | ipv4('prefix') == 30 - debug: var: networks

There and Back Again/How The Internet Works

Automation in an Infrastructure as Code World

Decentralised Social Media? – Secure Scuttlebutt

Share this:

Share this:

Share this:

Pre-requisites

Building the file structure

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this: