JonTheNiceGuy and "The Chief" Peter Bleksley at BSides Liverpool 2019

Review of BSides Liverpool 2019

I had the privilege today to attend BSides Liverpool 2019. BSides is an infosec community conference. The majority of the talks were recorded, and I strongly recommend making your way through the content when it becomes available.

Full disclosure: while my employer is a sponsor, I was not there to represent the company; I was just enjoying the show. A former colleague (a good friend and, while he was still employed by Fujitsu, an FDE, so I think he still is one) is part of the organising team.

The first talk I saw (aside from the welcome speech) was the keynote by Omri Segev Moyal (@gelossnake) about how to use serverless technologies (like AWS Lambda) to build a malware research platform. The key takeaway I have from that talk was how easy it is to build a simple Python Lambda function using Chalice. That was fantastic, and I'm looking forward to trying some things with that service!

For various reasons (mostly because I got talking to people), I missed the rest of the morning tracks except for the last talk before lunch. I heard great things about the Career Advice talk by Martin King, and the Social Engineering talk by Tom H, but I will need to catch up on those when the videos are released.

Just before lunch, we had a talk from "The Chief" (from the Channel 4 TV series "Hunted"), Peter Bleksley, about an investigation he's currently involved in. This was quite an intense session, and his history (the first quarter of his talk) was very interesting. Just before he went in for his talk, I got a selfie with him (which is the "Featured Image" for this post :) )

After lunch, I sat on the Rookies Track, and saw three fantastic talks, from Chrissi Robertson (@frootware) on Imposter Syndrome, Matt (@reversetor) on "Privacy in the age of Convenience" (reminding me of one of my very early talks at OggCamp/BarCamp Manchester) and Jan (@janfajfer) about detecting data leaks on mobile devices with EVPN. All three speakers were fab and nailed their content.

Next up was an unrecorded talk by Jamie (@2sec4u) about WannaCry, as he was part of the company that discovered the "Kill-Switch" domain. He gave a very detailed overview of the WannaCry timeline, the current situation of the kill-switch, and a view of some of the data from infected-but-dormant machines which are still trying to reach the kill-switch. A very scary but well explained talk. Also, memes and rude words, but it's clearly a subject that needed some levity, being part of a frankly rubbish set of circumstances.

After that was a talk from (two-out-of-six of) The Beer Farmers. This was a talk (mostly) about privacy and the lack of it from the social media systems of Facebook, Twitter and Google. As I listen to The Many Hats Club podcast, on which the Beer Farmers occasionally appear, it was a great experience matching faces to voices.

We finished the day with a talk by Finux (@f1nux) about Machiavelli, and how his writings (in the form of "The Prince") apply to infosec. I was tempted to take a whole slew of photos of the slide deck, but figured I'd just wait for the video to be released, as it would, I'm sure, make more sense in context.

There was a closing talk, and then everyone retired to the bar. All in all, a great day, and I'm really glad I got the opportunity to go (thanks for your ticket Paul (@s7v7ns) - you missed out mate!)

"Feb 11" by "Gordon" on Flickr

How to quickly get the next SemVer for your app

SemVer, short for Semantic Versioning, is an easy way of numbering your software versions. Versions follow the model Major.Minor.Patch (for example, 0.9.1), and SemVer has a very opinionated view on what is considered a Major "version bump" and what isn't.

Sometimes, when writing a library, it's easy to forget what version you're on. Perhaps you have a feature change you're working on, but also bug fixes to two or three previous versions you need to keep an eye on? How about an easy way of figuring out what that next bump should be?

In a recent conversation on the McrTech slack, Steven [0] mentioned he had a simple bash script for incrementing his SemVer numbers, and posted it over. Naturally, I tweaked it to work more easily for my use cases. So, this is *mostly* Steven's code, but with a bit of a wrapper before and after by me :)

So how do you use this? Dead simple: run nextver in a tree that has an existing SemVer git tag to get the next patch number. If you want to bump it to the next minor or major version, try nextver minor or nextver major. If you don't have a git tag, and don't specify a SemVer number, then it'll just assume you're starting fresh, and return 0.0.1 :)
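To give a flavour of the version-bumping logic (this is my illustration, not Steven's actual script - the real nextver also discovers the current version from your git tags, which I've left out here):

```shell
# Sketch of a SemVer bump. Usage: bump <current-version> [major|minor|patch]
bump() {
  version="${1:-0.0.0}"
  part="${2:-patch}"

  # Split Major.Minor.Patch into its three components
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"

  # Increment the requested part, zeroing everything to its right
  case "${part}" in
    major) echo "$((major + 1)).0.0" ;;
    minor) echo "${major}.$((minor + 1)).0" ;;
    patch) echo "${major}.${minor}.$((patch + 1))" ;;
  esac
}

bump "0.9.1" patch   # → 0.9.2
bump "0.9.1" minor   # → 0.10.0
bump "0.9.1" major   # → 1.0.0
```

The only subtlety is that bumping minor resets patch to zero, and bumping major resets both - which is exactly the SemVer rule.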

Want to find more cool stuff from the original author of this work? There's a video by the author below :)

[0] Steven from Leafshade Software's Recent Youtube Video

Featured image is "Feb 11" by "Gordon" on Flickr and is released under a CC-BY-SA license.

"Key mess" by "Alper Çuğun" on Flickr

Making Ansible Keys more useful in complex and nested data structures

One of the things I miss about Jekyll when I'm working with Ansible is the ability to fragment my data across multiple files, but still have it as a structured *whole* at the end.

For example, given the following directory structure in Jekyll:

+ _data
|
+---+ members
|   +--- member1.yml
|   +--- member2.yml
|
+---+ groups
    +--- group1.yml
    +--- group2.yml

The content of member1.yml and member2.yml will be rendered into site.data.members.member1 and site.data.members.member2 and likewise, group1 and group2 are loaded into their respective variables.

This kind of structure isn't possible in Ansible, because all the data files are merged into one flat vars namespace. To work around this on a few different projects I've worked on, I've ended up doing the following:

- set_fact:
    my_members: |-
      {
        {%- for var in vars | dict2items -%}
          {%- if var.key | regex_search(my_regex) is not none -%}
            "{{ var.key | regex_replace(my_regex, '') }}":
              {%- if var.value is string %}"{% endif -%}
              {{ var.value }}
              {%- if var.value is string %}"{% endif %},
          {%- endif -%}
        {%- endfor -%}
      }
  vars:
    my_regex: '^member_'

So, what this does is step over all the variables defined (for example, in host_vars/, group_vars/, from the gathered facts and from the role you're in - following Ansible's variable precedence), and then check whether the key of that variable (e.g. "member_i_am_a_member" or "member_1") matches the regular expression. If it does, the key (minus the part matched by the regular expression [using regex_replace]) is added to a dictionary, and the value attached. If the value is actually a string, then it wraps it in quotes.
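For example (the variable names here are made up for illustration), given these host variables, the set_fact task above builds a dictionary keyed on whatever followed the "member_" prefix:

```yaml
# Input variables, e.g. in host_vars/myhost.yml:
member_alice:
  somevalue: 1
member_bob: "hello"

# After the set_fact task runs, my_members would contain:
#
# my_members:
#   alice:
#     somevalue: 1
#   bob: "hello"
```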

So, while this doesn't give me my expressive data structure that Jekyll does (no site.data.members.member1.somevalue for me), I do at least get to have my_members.member1.somevalue if I put the right headers in! :)

I'll leave extending this model to build other sorts of variables (for example, something like if var.value['variable_place'] | default('') == 'my_members.member' + current_position) as an exercise for the reader - have a think about how you could use something like this in your own workflows!

Featured image is "Key mess" by "Alper Çuğun" on Flickr and is released under a CC-BY license.

"Seca" by "Olearys" on Flickr

Getting Started with Terraform on Azure

I'm strongly in the "Ansible is my tool, what needs fixing?" camp when it comes to Infrastructure as Code (IaC), but I know there are other tools out there which are equally good. I've been strongly advised to take a look at Terraform from HashiCorp. I'm most familiar at the moment with Azure, so this is going to be based around resources available on Azure.


Late edit: I want to credit my colleague, Pete, for his help getting started with this. While many of the code samples have been changed from what he provided me with, if it hadn't been for these code samples in the first place, I'd never have got started!

Late edit 2: This post was initially based on Terraform 0.11, and I was prompted by another colleague, Jon, that the available documentation still follows the 0.11 layout. 0.12 was released in May, and changes how variables are reused in the code. This post now *should* follow the 0.12 conventions, but if you spot something where it doesn't, check out this post from the Terraform team.


As with most things, there's a learning curve, and I struggled to find a "simple" getting started guide for Terraform. I'm sure this is a failing on my part, but I thought it wouldn't hurt to put something out there for others to pick up and see if it helps someone else (and, if that "someone else" is you, please let me know in the comments!)

Pre-requisites

You need an Azure account for this. This part is very far outside my spectrum of influence, but I'm assuming you've got one. If not, look at something like Digital Ocean, AWS or VMWare :) For my "controller", I'm using Windows Subsystem for Linux (WSL), and I wrote up some notes about getting my pre-requisites sorted.

Building the file structure

One quirk with Terraform, versus other tools like Ansible, is that when you run one of the terraform commands (like terraform init, terraform plan or terraform apply), it reads the entire content of any file suffixed ".tf" in that directory. If you don't want a file to be loaded, you need to either move it out of the directory, comment it out, or rename it so it doesn't end in .tf. By convention, you normally have three "standard" files in a Terraform directory - main.tf, variables.tf and output.tf - but logically speaking, you could have everything in a single file, or each instruction in its own file. Because this is a relatively simple script, I'll use the standard layout.

The actions I'll be performing are the "standard" steps you'd perform in Azure to build a single Infrastructure as a Service (IaaS) server:

  • Create your Resource Group (RG)
  • Create a Virtual Network (VNET)
  • Create a Subnet
  • Create a Security Group (SG) and rules
  • Create a Public IP address (PubIP) with a DNS name associated to that IP.
  • Create a Network Interface (NIC)
  • Create a Virtual Machine (VM), supplying a username and password, the size of disks and VM instance, and any post-provisioning instructions (yep, I'm using Ansible for that :) ).

I'm using Visual Studio Code, but almost any IDE will have integrations for Terraform. The main things I'm using it for are auto-completion of resource, data and output types, and the fact that Ctrl+clicking resource types opens your browser at the documentation page on terraform.io.

So, creating my main.tf, I start by telling it that I'm working with the Terraform AzureRM Provider (the bit of code that can talk Azure API).
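The original code block lives in the Gist linked at the end of the post; the provider declaration itself is about as minimal as Terraform code gets - something like:

```hcl
# Tell Terraform we're talking to Azure via the AzureRM provider.
provider "azurerm" {
}
```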

This simple statement is enough to get Terraform to load the AzureRM, but it still doesn't tell Terraform how to get access to the Azure account. Use az login from a WSL shell session to authenticate.

Next, we create our basic resource, vnet and subnet resources.
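The Gist has the exact code; a sketch of those three resources looks like this (the var.* names are my stand-ins for the variables described below, and the address ranges are illustrative):

```hcl
resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.resource_group_location
}

resource "azurerm_virtual_network" "vnet" {
  name                = var.vnet_name
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  address_space       = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "subnet" {
  name                 = var.subnet_name
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefix       = "10.0.1.0/24"
}
```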

But wait, I hear you cry, what are those var.something bits in there? I mentioned before that in the "standard" set of files is a "variables.tf" file. In here, you specify values for later consumption. I have recorded variables for the resource group name and location, as well as the VNet name and subnet name. Let's add those into variables.tf.
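A variables.tf for those four values might look like this (the names and defaults are illustrative - match them to whatever you reference in your main.tf):

```hcl
variable "resource_group_name" {
  description = "Name of the resource group to create"
  default     = "demo-rg"
}

variable "resource_group_location" {
  description = "Azure region to deploy into"
  default     = "West Europe"
}

variable "vnet_name" {
  default = "demo-vnet"
}

variable "subnet_name" {
  default = "demo-subnet"
}
```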

When you've specified a resource, you can capture any of the results from that resource to use later - either in the main.tf or in the output.tf files. By creating the resource group (called "rg" here, but you can call it anything from "demo" to "myfirstresourcegroup"), we can consume the name or location with azurerm_resource_group.rg.name and azurerm_resource_group.rg.location. In the above code, we use the VNet name in the subnet, and so on.

After the subnet is created, we can start adding the VM specific parts - a security group (with rules), a public IP (with DNS name) and a network interface. I'll create the VM itself later. So, let's do this.
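The full block is in the Gist; a sketch of those three resources follows (the resource names iaaspubip and iaasnic match the VM block later in the post, but the rule, label and other values here are illustrative):

```hcl
resource "azurerm_network_security_group" "sg" {
  name                = "iaas-sg"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  security_rule {
    name                       = "SSH"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "${trimspace(data.http.icanhazip.body)}/32"
    destination_address_prefix = "*"
  }
}

resource "azurerm_public_ip" "iaaspubip" {
  name                = "iaas-pubip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Dynamic"
  domain_name_label   = "my-iaas-demo" # must be unique within the region
}

resource "azurerm_network_interface" "iaasnic" {
  name                      = "iaas-nic"
  location                  = azurerm_resource_group.rg.location
  resource_group_name       = azurerm_resource_group.rg.name
  network_security_group_id = azurerm_network_security_group.sg.id

  ip_configuration {
    name                          = "iaasipconf"
    subnet_id                     = azurerm_subnet.subnet.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.iaaspubip.id
  }
}
```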

BUT WAIT, what's that ${trimspace(data.http.icanhazip.body)}/32 bit there?? Any resources we want to load from the Terraform state, but that we've not directly defined ourselves, need to come from somewhere. These items are classed as "data" - that is, we want to know what their values are, but we aren't *changing* the service to get them. You can also use this to import other resource items - perhaps a virtual network that was created by another team, or perhaps your account doesn't have the rights to create a resource group. I'll include a commented-out data block in the overall main.tf file for review that specifies a VNet, if you want to see how that works.

In this case, I want to put the public IP address I'm coming from into the NSG rule, so I can get access to the VM without opening it up to *everyone*. I can't be sure that my IP address won't change between one run and the next, so I'm using the icanhazip.com service to determine my IP address. But I've not defined how to get that resource yet. Let's add it to the main.tf for now.
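That data block is tiny - something like this (the exact URL is my assumption; icanhazip.com also offers an ipv4-only endpoint if you want to force IPv4):

```hcl
# Fetch my current public IP from icanhazip.com; the response body
# is then available as data.http.icanhazip.body
data "http" "icanhazip" {
  url = "https://icanhazip.com"
}
```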

So, we're now ready to create our virtual machine. It's quite a long block, but I'll pull certain elements apart once I've pasted this block in.

So, this is broken into four main pieces.

  • Virtual Machine details. This part is relatively sensible: name, RG, location, NIC, size, and what happens to the disks when the machine is terminated. OK.
name                             = "iaas-vm"
location                         = azurerm_resource_group.rg.location
resource_group_name              = azurerm_resource_group.rg.name
network_interface_ids            = [azurerm_network_interface.iaasnic.id]
vm_size                          = "Standard_DS1_v2"
delete_os_disk_on_termination    = true
delete_data_disks_on_termination = true
  • Disk details.
storage_image_reference {
  publisher = "Canonical"
  offer     = "UbuntuServer"
  sku       = "18.04-LTS"
  version   = "latest"
}
storage_os_disk {
  name              = "iaas-os-disk"
  caching           = "ReadWrite"
  create_option     = "FromImage"
  managed_disk_type = "Standard_LRS"
}
  • OS basics: VM hostname, the username of the first user, and its password. Note: if you want to use an SSH key, it must be stored without a passphrase for Terraform to use. If you specify an SSH key here as well as a password, this can cause all sorts of connection issues, so pick one or the other.
os_profile {
  computer_name  = "iaas"
  admin_username = var.ssh_user
  admin_password = var.ssh_password
}
os_profile_linux_config {
  disable_password_authentication = false
}
  • And lastly, provisioning. I want to use Ansible for my provisioning. In this example, I have a basic playbook stored locally on my Terraform host, which I transfer to the VM, install Ansible via pip, and then execute ansible-playbook against the file I uploaded. This could just as easily be a git repo to clone or a shell script to copy in, but this is a "simple" example.
provisioner "remote-exec" {
  inline = ["mkdir /tmp/ansible"]

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

provisioner "file" {
  source = "ansible/"
  destination = "/tmp/ansible"

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

provisioner "remote-exec" {
  inline = [
    "sudo apt update > /tmp/apt_update || cat /tmp/apt_update",
    "sudo apt install -y python3-pip > /tmp/apt_install_python3_pip || cat /tmp/apt_install_python3_pip",
    "sudo -H pip3 install ansible > /tmp/pip_install_ansible || cat /tmp/pip_install_ansible",
    "ansible-playbook /tmp/ansible/main.yml"
  ]

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

This part of the code is done in three parts - create the upload path, copy the files in, and then execute the playbook. If you don't create the upload path, it'll upload just the first file it comes to into the path specified.

Each remote-exec and file provisioner statement must include the hostname, username and either the password, or SSH private key. In this example, I provide just the password.

So, having created all this lot, you need to execute the Terraform workload. Initially you do terraform init. This downloads the providers referenced by these .tf files and puts them into the same tree as the files are stored in. It also initialises the Terraform state storage.

Next, you do terraform plan -out tfout. Technically, the tfout part can be any filename, but having something like tfout marks it as clearly part of Terraform. This creates the tfout file containing the current state and whatever needs to change in the Terraform state file on its next run. Typically, if you don't use a tfout file within about 20 minutes, it's probably worth removing it.

Finally, once you've run your plan stage, you need to apply it. In this case you execute terraform apply tfout. This tfout is the same filename you specified in terraform plan. If you don't include -out tfout on your plan (or don't run a plan at all) and just run terraform apply on its own, it performs the plan stage itself and asks you to confirm before applying.

When I ran this, with a handful of changes to the variable files, I got this result:

Once you're done with your environment, use terraform destroy to shut it all down... and enjoy :)

The full source is available in the associated Gist. Pull requests and constructive criticism are very welcome!

Featured image is "Seca" by "Olearys" on Flickr and is released under a CC-BY license.

"pulpit and bible" by "Joel Kramer" on Flickr

Preaching about Firefox Containers (and how they can change your Internet life)

I want to preach for a few minutes about Containers in Firefox. This is not like Docker containers, a Snap Package (using cgroups), or Shipping Containers, but instead a way of describing how each tab protects you from tracking.

Here's a quick lesson in how the web works. When you visit a website and get the HTML page, it might *also* ask you to store a small text file, a "Cookie", that then gets handed *back* to that site the next time you visit. It's an easy way of saying "I've been here before, you know me already".

This doesn't just happen when you visit a web page (unless the web page is really *really* simple), it also happens for each resource on that page. If the page also asks for an image (say, the logo of a social media network), a script (say, a banner bar from an advertising network) or a font (yep, web fonts are also a thing!), each one of those also gets to say "here's a Cookie, keep it for the next time you come back".

For a few years, there have been ad-blockers (my favourite two are "uBlock Origin" and "Privacy Badger"), which can stop the content from ever being loaded... but it's an arms race. The ad-blockers stop content from being loaded (mostly it's just to stop the adverts, but the other stuff is a benefit that they've kept on doing), then the tracking firms do something else to make it so their content is loaded, and so-on. Firefox also has "Private Browsing Mode", which can stop "third party cookies" (the ones from each of the additional sources on the page) from being shared... but I always think that Private Browsing mode looks shady.

In the last couple of years, Firefox started an experiment called "Firefox Multi-Account Containers" (or just "Containers" for short). This is designed to give each container its own "state" (cookies and so on), shared only between the tabs in that container.

You can mark particular websites as being part of a particular container, so Twitter, Facebook and GMail all end up in my "Personal" container, whereas the sites I need for work are in the "Work" container.

For a while I was using them to support family members ("I just need you to log into your GMail account for me to have a poke around... let me create a new container for your account", or "Let's have a look at why you're getting those Facebook posts. Can you log in in this container here?").

Then I needed it to separate out a couple of Twitter accounts I'm responsible for (using the "Switch Containers" extension to jump between them)... Then I found a new extension which upgraded how I use them: "Temporary Containers". With a couple of tweaks (see below), this makes every new tab into its own container... so it's a bit like Private Browsing Mode, but one which dynamically turns itself into a "non-private mode" if you hit the right URL.

So, this is my work-flow - it might not work for you, but equally, it might! When I open a new tab, or visit a website that isn't already categorised as a "Personal", "Work" (or so-on) container, I get taken to a new "Temporary" container.

A Temporary Container window (note the "tmp2" in the address bar)

I then ask myself if this is something I need to log into with one of my existing containers (e.g. Google, Facebook, Twitter, Github, Azure, AWS etc), and if so, I'll "Switch Containers" to that container (e.g. Personal).

Switching containers with the "Switch Containers" button

If I think that I always want to open it here then I'll click on the "Containers" button in the bar, and select "Always open in 'Personal'".

Selecting "Always open in Personal"

If I've categorised something that I need to swap to something else (e.g. Twitter for another account, or a family member's GMail account), then I explicitly "Switch Containers" or open a tab in that container first, and then go to the website.

If I need a new container for this window, I use the + symbol next to the "Edit Containers" button in the Containers menu in the window bar.

Adding a new container with the + button in the Open Container Tab dialogue

I also use the "Open Bookmark in Container" extension, for when I'm using bookmarks, as, by default, these can't be opened in a container. I also use the "Containers Theme" extension, as can be seen by the colour changes in the above screenshots.

While this is fully available for Firefox on Desktop, it's not yet available on Firefox for Android or Firefox for iOS, and there's no word on whether it will come at all...

Featured image is "pulpit and bible" by "Joel Kramer" on Flickr and is released under a CC-BY license.

"Untitled" by "Ryan Dickey" on Flickr

Run an Ansible Playbook against a Check Point Gaia node running R80+

In Check Point Gaia R77, if you wanted to run Ansible against this node, you were completely out of luck. The version of Python on the host was broken, modules were missing and ... well, it just wouldn't work.

Today, I'm looking at running some Ansible playbooks against Check Point R80 nodes. Here are some steps you need to work through to make it work.

  1. Make sure the user that Ansible is going to be using has the shell /bin/bash. If you don't have this set up, the command is: set user ansible shell /bin/bash.
  2. If you want a separate user account to do ansible actions, run these commands:
    add user ansible uid 9999 homedir /home/ansible
    set user ansible password-hash $1$D3caF9$B4db4Ddecafbadnogoood (note this hash is not valid!)
    add rba user ansible roles adminRole
    set user ansible shell /bin/bash
  3. Make sure your inventory specifies the right path for your Python binary. In the next code block you'll see my inventory for three separate Check Point R80+ nodes. Note that I'll only be targeting the "checkpoint" group, but that I'm using the r80_10, r80_20 and r80_30 groups to load the variables in. I could, alternatively, add these as values in group_vars/r80_10.yml and so on, but I find keeping everything to do with my connection in one place much cleaner. The Python interpreter is in a separate path for each version, and if you don't specify ansible_ssh_transfer_method=piped you'll get a message like this: [WARNING]: sftp transfer mechanism failed on [cpr80-30]. Use ANSIBLE_DEBUG=1 to see detailed information (fix from "Add pipeline-ish method using dd for file transfer over SSH" (#18642) on the Ansible git repo)
[checkpoint]
cpr80-10        ansible_user=admin      ansible_password=Sup3rS3cr3t-
cpr80-20        ansible_user=admin      ansible_password=Sup3rS3cr3t-
cpr80-30        ansible_user=admin      ansible_password=Sup3rS3cr3t-

[r80_10]
cpr80-10

[r80_20]
cpr80-20

[r80_30]
cpr80-30

[r80_10:vars]
ansible_ssh_transfer_method=piped
ansible_python_interpreter=/opt/CPsuite-R80/fw1/Python/bin/python

[r80_20:vars]
ansible_ssh_transfer_method=piped
ansible_python_interpreter=/opt/CPsuite-R80.20/fw1/Python/bin/python

[r80_30:vars]
ansible_ssh_transfer_method=piped
ansible_python_interpreter=/opt/CPsuite-R80.30/fw1/Python/bin/python

And there you have it, one quick "ping" check later...

$ ansible -m 'ping' -i hosts checkpoint
cpr80-10 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
cpr80-30 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
cpr80-20 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

One quick word of warning though: don't use gather_facts: true or the setup: module. Both of these rely on libraries that are missing from the Check Point nodes, and won't work... But then again, you can get whatever you need from shell commands... right? ;)
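As a rough sketch of that approach (the clish command here is illustrative - substitute whatever Gaia command gives you the facts you need), a facts-free play might look like:

```yaml
- hosts: checkpoint
  gather_facts: false   # setup/facts gathering breaks on Gaia, so skip it
  tasks:
    - name: Read the Gaia product version with a shell command
      shell: clish -c "show version product"
      register: gaia_version
      changed_when: false

    - debug:
        var: gaia_version.stdout
```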

Featured image is “Untitled” by “Ryan Dickey” on Flickr and is released under a CC-BY license.

"Tower" by " Yijun Chen" on Flickr

Building a Gitlab and Ansible Tower (AWX) Demo in Vagrant with Ansible

TL;DR - I created a repository on GitHub containing a Vagrantfile and an Ansible Playbook to build a VM running Docker. That VM hosts AWX (Ansible Tower's upstream open-source project) and Gitlab.

A couple of years ago, a colleague created (and I enhanced) a Vagrant and Ansible playbook called "Project X" which would run an AWX instance in a Virtual Machine. It's a bit heavy, and did a lot of things to do with persistence that I really didn't need, so I parked my changes and kept an eye on his playbook...

Fast-forward to a week-or-so ago. I needed to explain what a Git/Ansible Workflow would look like, and so I went back to look at ProjectX. Oh my, it looks very complex and consumed a lot of roles that, historically, I've not been that impressed with... I just needed the basics to run AWX. Oh, and I also needed a Gitlab environment.

I knew that Gitlab had a docker-based install, and so does AWX, so I trundled off to find some install guides. These are listed in the playbook I eventually created (hence not listing them here). Not all the choices I made were inspired by those guides - I wanted to make quite a bit of this stuff "build itself"... this meant I wanted users, groups and projects to be created in Gitlab, and users, projects, organisations, inventories and credentials to be created in AWX.

I knew that you can create Docker containers in Ansible, so after I'd got my pre-requisites built (full upgrade, docker installed, pip libraries installed), I added the gitlab-ce:latest docker image, and exposed some ports. Even now, I'm not getting the SSH port mapped that I was expecting, but... it's no disaster.

I did notice that the Gitlab service takes ages to start once the container is marked as running, so I did some more digging, and found that the uri module can be used to poll a URL. It wasn't well documented how to make it keep polling until you get the response you want, so... I raised a PR on the Ansible project's github repo for that one (and I also wrote a blog post about it earlier too).
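The polling pattern I ended up with looks roughly like this (the URL, port and retry numbers here are illustrative, not the exact values from my playbook):

```yaml
- name: Wait for the Gitlab web UI to respond
  uri:
    url: "http://localhost:8080/users/sign_in"
    status_code: 200
  register: gitlab_http
  # Keep re-running the task until we get a 200 back,
  # checking every 5 seconds for up to 5 minutes
  until: gitlab_http.status == 200
  retries: 60
  delay: 5
```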

Once I had a working Gitlab service, I needed to customize it. There are a bunch of Gitlab modules in Ansible, but since a few Gitlab releases back these don't work any more, so I had to find a different way. That different way was to run an internal command called "gitlab-rails". It's not perfect (it doesn't create repos in your projects, for example) but it's pretty good at giving you just enough to build your demo environment. So that's Gitlab up...

Now I need to build AWX. There are lots of build guides for this, but actually I had most luck using the README in their repository (I know, who'd have thought it!??!). There are some "Secrets" that should be changed in production, which I'm changing in my script, but on the whole, it's pretty much a vanilla install.

Unlike the Gitlab modules, the Ansible Tower modules all work, so I use these to create the users, credentials and so-on. Like the gitlab-rails commands, however, the documentation for using the tower modules is pretty ropey, and I still don't have things like "getting your users to have access to your organisation" working from the get-go, but for the bulk of the administration, it does "just work".

Like all my playbooks, I use group_vars to define the stuff I don't want to keep repeating. In this demo, I've set all the passwords to "Passw0rd", and I've created 3 users in both AWX and Gitlab - csa, ops and release - indicative of the sorts of people this demo I ran was aimed at - Architects, Operations and Release Managers.

Maybe, one day, I'll even be able to release the presentation that went with the demo ;)

On a more productive note, if you're doing things with the tower_ modules and want to tell me what I need to fix up, or if you're doing awesome things with the gitlab-rails tool, please visit the repo with this automation code in, and take a look at some of my "todo" items! Thanks!!

Featured image is "Tower" by "Yijun Chen" on Flickr and is released under a CC-BY-SA license.

"funfair action" by "Jon Bunting" on Flickr

Improving the speed of Azure deployments in Ansible with Async

Recently I was building a few environments in Azure using Ansible, and found this stanza which helped me to speed things up.

  - name: "Schedule UDR Creation"
    azure_rm_routetable:
      resource_group: "{{ resource_group }}"
      name: "{{ item.key }}_udr"
    loop: "{{ routetables | dict2items }}"
    loop_control:
      label: "{{ item.key }}_udr"
    async: 1000
    poll: 0
    changed_when: False
    register: sleeper

  - name: "Check UDRs Created"
    async_status:
      jid: "{{ item.ansible_job_id }}"
    register: sleeper_status
    until: sleeper_status.finished
    retries: 500
    delay: 4
    loop: "{{ sleeper.results|flatten(levels=1) }}"
    when: item.ansible_job_id is defined
    loop_control:
      label: "{{ item._ansible_item_label }}"

What we do here is start an action with an "async" time (to give the schedule an opportunity to register itself) and a "poll" time of 0 (to prevent Ansible from waiting for it to finish). We then tell it that it's "never changed" (changed_when: False), because otherwise it always shows as changed, and register the scheduled item itself as "sleeper".

After all the async jobs get queued, we then check the status of all the scheduled items with the async_status module, passing it the registered job ID. This lets me spin up a lot more items in parallel, and then "just" confirm afterwards that they've been run properly.

It's not perfect, and it can make for rather messy code. But, it does work, and it's well worth giving it the once over, particularly if you've got some slow-to-run tasks in your playbook!

Featured image is "funfair action" by "Jon Bunting" on Flickr and is released under a CC-BY license.

A web browser with the example.com web page loaded

Working around the fact that Ansible’s URI module doesn’t honour the no_proxy variable…

An Ansible project I've been working on has tripped me up this week. I'm working with some HTTP APIs and I need to check early whether I can reach the host. To do this, I used a simple Ansible Core Module which lets you call an HTTP URI.

- uri:
    follow_redirects: none
    validate_certs: False
    timeout: 5
    url: "http{% if ansible_https | default(True) %}s{% endif %}://{{ ansible_host }}/login"
  register: uri_data
  failed_when: False
  changed_when: False

This all seems pretty simple. One of the environments I'm working in uses the following values in their environment:

http_proxy="http://192.0.2.1:8080"
https_proxy="http://192.0.2.1:8080"
no_proxy="10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24"

And this breaks the uri module, because it tries to punt everything through the proxy if the "no_proxy" contains CIDR values (like 192.0.2.0/24) (there's a bug raised for this)... So here's my fix!

- set_fact:
    no_proxy_match: |
      {
        {% for no_proxy in (lookup('env', 'no_proxy') | replace(',', '') ).split() %}
          {% if no_proxy| ipaddr | type_debug != 'NoneType' %}
            {% if ansible_host | ipaddr(no_proxy) | type_debug != 'NoneType' %}
              "match": "True"
            {% endif %}
          {% endif %}
        {% endfor %}
      }

- uri:
    follow_redirects: none
    validate_certs: False
    timeout: 5
    url: "http{% if ansible_https | default(True) %}s{% endif %}://{{ ansible_host }}/login"
  register: uri_data
  failed_when: False
  changed_when: False
  environment: "{ {% if no_proxy_match.match | default(False) %}'no_proxy': '{{ ansible_host }}'{% endif %} }"

So, let's break this down.

The key part to this script is that we need to override the no_proxy environment variable with the IP address that we're trying to address (so that we're not putting 16M addresses for 10.0.0.0/8 into no_proxy, for example). To do that, we use the exact same URI block, except for the environment line at the end.

In turn, the set_fact block steps through the no_proxy values, looking for IP addresses to check. {% if no_proxy | ipaddr ... %} says "if the no_proxy value is an IP address, return it, but if it isn't, return a 'None' value". If it is an IP address or subnet mask, it then checks whether the address of the host you're trying to reach falls inside that range: {% if ansible_host | ipaddr(no_proxy) ... %} says "if the ansible_host address falls inside the no_proxy range, return it, otherwise return a 'None' value". Both of these checks say "if this check returns anything other than a 'None' value, do the next thing", and on the last check, the "next" thing is to set the flag 'match' to 'true'. When we get to the environment variable, we say "if match is not true, it's false, so don't put a value in there".

So that's that! Yes, I could merge the set_fact block into the environment variable, but I do end up using that fact a fair amount. And really, if it was merged, it would be even MORE complicated to pick through.

I have raised a pull request on the Ansible project to update the documentation, so we'll see whether we end up with people over here looking for ways around this issue. If so, let me know in the comments below! Thanks!!