Ansible Behaviour Change

For those of you who are working with #Ansible… Ansible 2.5 is out, and has an unusual documentation change around a key Ansible concept – `with_` loops Where you previously had:

with_dict: "{{ your_fact }}"
or
with_subelements:
- "{{ your_fact }}"
- some_subkey

This now should be written like this:

loop: "{{ lookup('dict', your_fact) }}"
and
loop: "{{ lookup('subelements', your_fact, 'some_subkey') }}"

Fear not, I hear you say, It’s fine, of course the documentation suggests that this is “how it’s always been”…… HA HA HA Nope. This behaviour is new as of 2.5, and needs ansible to be updated to the latest version. As far as I can tell, there’s no way to indicate to Ansible “Oh, BTW, this needs to be running on 2.5 or later”… so I wrote a role that does that for you.

ansible-galaxy install JonTheNiceGuy.version-check

You’re welcome :)

More useful URLs:

Creating OpenStack “Allowed Address Pairs” for Clusters with Ansible

This post came about after a couple of hours of iterations, so I can’t necessarily quote all the sources I worked from, but I’ll do my best!

In OpenStack (particularly in the “Kilo” release I’m working with), if you have a networking device that will pass traffic on behalf of something else (e.g. Firewall, IDS, Router, Transparent Proxy) you need to tell the virtual NIC that the interface is allowed to pass traffic for other IP addresses, as OpenStack applies by default a “Same Origin” firewall rule to the interface. Defining this in OpenStack is more complex than it could be, because for some reason, you can’t define 0.0.0.0/0 as this allowed address pair, so instead you have to define 0.0.0.0/1 and 128.0.0.0/1.

Here’s how you define those allowed address pairs (note, this assumes you’ve got some scaffolding in place to define things like “network_appliance”):

allowed_address_pairs: "{% if (item.0.network_appliance|default('false')|lower() == 'true') or (item.1.network_appliance|default('false')|lower() == 'true') %}[{'ip_address': '0.0.0.0/1'}, {'ip_address': '128.0.0.0/1'}]{% else %}{{ item.0.allowed_address_pairs|default(omit) }}{% endif %}"

OK, so we’ve defined the allowed address pairs! We can pass traffic across our firewall. But (and there’s always a but), the product I’m working with at the moment has a floating MAC address in a cluster, when you define an HA pair. They have a standard schedule for how each port’s floating MAC is assigned… so here’s what I’ve ended up with (and yes, I know it’s a mess!)

allowed_address_pairs: "{% if (item.0.network_appliance|default('false')|lower() == 'true') or (item.1.network_appliance|default('false')|lower() == 'true') %}[{'ip_address': '0.0.0.0/1'},{'ip_address': '128.0.0.0/1'}{% if item.0.ha is defined and item.0.ha != '' %}{% for vdom in range(0,40, 10) %},{'ip_address': '0.0.0.0/1','mac_address': '{{ item.0.floating_mac_prefix|default(item.0.image.floating_mac_prefix|default(floating_mac_prefix)) }}:{% if item.0.ha.group_id|default(0) < 16 %}0{% endif %}{{ '%0x' | format(item.0.ha.group_id|default(0)|int) }}:{% if vdom+(item.1.interface|default('1')|replace('port', '')|int)-1 < 16 %}0{% endif %}{{ '%0x' | format(vdom+(item.1.interface|default('1')|replace('port', '')|int)-1) }}'}, {'ip_address': '128.0.0.0/1','mac_address': '{{ item.0.floating_mac_prefix|default(item.0.image.floating_mac_prefix|default(floating_mac_prefix)) }}:{% if item.0.ha.group_id|default(0) < 16 %}0{% endif %}{{ '%0x' | format(item.0.ha.group_id|default(0)|int) }}:{% if vdom+(item.1.interface|default('0')|replace('port', '')|int)-1 < 16 %}0{% endif %}{{ '%0x' | format(vdom+(item.1.interface|default('1')|replace('port', '')|int)-1) }}'}{% endfor %}{% endif %}]{% else %}{{ item.0.allowed_address_pairs|default(omit) }}{% endif %}"

Let's break this down a bit. The vendor says that each port gets a standard prefix, (e.g. DE:CA:FB:AD) then the penultimate octet is the "Cluster ID" number in hex, and then the last octet is the sum of the port number (zero-indexed) added to a VDOM number, which increments in 10's. We're only allowed to assign 10 "allowed address pairs" to an interface, so I've got the two originals (which are assigned to "whatever" the defined mac address is of the interface), and four passes around. Other vendors (e.g. this one) do things differently, so I'll probably need to revisit this once I know how the next one (and the next one... etc.) works!

So, we have here a few parts to make that happen.

The penultimate octet, which is the group ID in hex needs to be two hex digits long, and without adding more python modules to our default machines, we can't use a "pad" filter (to add 0's to the beginning of the mac octets), so we do that by hand:

{% if item.0.ha.group_id|default(0) < 16 %}0{% endif %}

And here's how to convert the group ID into a hex number:

{{ '%0x' | format(item.0.ha.group_id|default(0)|int) }}

Then the next octet is the sum of the VDOM and PortID. First we need to loop around the VDOMs. We don't always know whether we're going to be adding VDOMs until after deployment has started, so here we will assume we've got 3 VDOMs (plus VDOM "0" for management) as it doesn't really matter if we don't end up using them. We create the vdom variable like this:

{% for vdom in range(0, 40, 10) %} STUFF {% endfor %}

We need to put the actual port ID in there too. As we're using a with_subelement loop we can't create an increment, but what we can do is ensure we're recording the interface number. This only works here because the vendor has a sequential port number (port1, port2, etc). We'll need to experiment further with other vendors! So, here's how we're doing this. We already know how to create a hex number, but we do need to use some other Jinja2 filters here:

{{ '%0x' | format(vdom+(item.1.interface|default('1')|replace('port', '')|int)-1) }}

Let's pull this apart a bit further. item.1.interface is the name of the interface, and if it doesn't exist (using the |default('1') part) we replace it with the string "1". So, let's replace that variable with a "normal" value.

{{ '%0x' | format(vdom+("port1"|replace('port', '')|int)-1) }}

Next, we need to remove the word "port" from the string "port1" to make it just "1", so we use the replace filter to strip part of that value out. Let's do that:

{{ '%0x' | format(vdom+("1"|int)-1) }}

After that, we need to turn the string "1" into the literal number 1:

{{ '%0x' | format(vdom+1-1) }}

We loop through vdom several times, but let's pick one instance of that at random - 30 (the fourth iteration of the vdom for-loop):

{{ '%0x' | format(30+1-1) }}

And then we resolve the maths:

{{ '%0x' | format(30) }}

And then the |format(30) turns the '%0x' into the value "1e"

Assuming the vendor prefix is, as I mentioned, 'de:ca:fb:ad:' and the cluster ID is 0, this gives us the following resulting allowed address pairs:

[
{"ip_address": "0.0.0.0/1"},
{"ip_address": "128.0.0.0/1"},
{"ip_address": "0.0.0.0/1", "mac_address": "de:ca:fb:ad:00:00"},
{"ip_address": "128.0.0.0/1", "mac_address": "de:ca:fb:ad:00:00"},
{"ip_address": "0.0.0.0/1", "mac_address": "de:ca:fb:ad:00:0a"},
{"ip_address": "128.0.0.0/1", "mac_address": "de:ca:fb:ad:00:0a"},
{"ip_address": "0.0.0.0/1", "mac_address": "de:ca:fb:ad:00:14"},
{"ip_address": "128.0.0.0/1", "mac_address": "de:ca:fb:ad:00:14"},
{"ip_address": "0.0.0.0/1", "mac_address": "de:ca:fb:ad:00:1e"},
{"ip_address": "128.0.0.0/1", "mac_address": "de:ca:fb:ad:00:1e"}
]

I hope this has helped you!

Sources of information:

Defining Networks with Ansible

In my day job, I’m using Ansible to provision networks in OpenStack. One of the complaints I’ve had about the way I now define them is that the person implementing the network has to spell out all the network elements – the subnet size, DHCP pool, the addresses of the firewalls and names of those items. This works for a manual implementation process, but is seriously broken when you try to hand that over to someone else to implement. Most people just want something which says “Here is the network I want to implement – 192.0.2.0/24″… and let the system make it for you.

So, I wrote some code to make that happen. It’s not perfect, and it’s not what’s in production (we have lots more things I need to add for that!) but it should do OK with an IPv4 network.

Hope this makes sense!

---
- hosts: localhost
  vars:
  - networks:
      # Defined as a subnet with specific router and firewall addressing
      external:
        subnet: "192.0.2.0/24"
        firewall: "192.0.2.1"
        router: "192.0.2.254"
      # Defined as an IP address and CIDR prefix, rather than a proper network address and CIDR prefix
      internal_1:
        subnet: "198.51.100.64/24"
      # A valid smaller network and CIDR prefix
      internal_2:
        subnet: "203.0.113.0/27"
      # A tiny CIDR network
      internal_3:
        subnet: "203.0.113.64/30"
      # These two CIDR networks are unusable for this environment
      internal_4:
        subnet: "203.0.113.128/31"
      internal_5:
        subnet: "203.0.113.192/32"
      # A massive CIDR network
      internal_6:
        subnet: "10.0.0.0/8"
  tasks:
  # Based on https://stackoverflow.com/a/47631963/5738 with serious help from mgedmin and apollo13 via #ansible on Freenode
  - name: Add router and firewall addressing for CIDR prefixes < 30     set_fact:       networks: >
        {{ networks | default({}) | combine(
          {item.key: {
            'subnet': item.value.subnet | ipv4('network'),
            'router': item.value.router | default((( item.value.subnet | ipv4('network') | ipv4('int') ) + 1) | ipv4),
            'firewall': item.value.firewall | default((( item.value.subnet | ipv4('broadcast') | ipv4('int') ) - 1) | ipv4),
            'dhcp_start': item.value.dhcp_start | default((( item.value.subnet | ipv4('network') | ipv4('int') ) + 2) | ipv4),
            'dhcp_end': item.value.dhcp_end | default((( item.value.subnet | ipv4('broadcast') | ipv4('int') ) - 2) | ipv4)
          }
        }) }}
    with_dict: "{{ networks }}"
    when: item.value.subnet | ipv4('prefix') < 30   - name: Add router and firewall addressing for CIDR prefixes = 30     set_fact:       networks: >
        {{ networks | default({}) | combine(
          {item.key: {
            'subnet': item.value.subnet | ipv4('network'),
            'router': item.value.router | default((( item.value.subnet | ipv4('network') | ipv4('int') ) + 1) | ipv4),
            'firewall': item.value.firewall | default((( item.value.subnet | ipv4('broadcast') | ipv4('int') ) - 1) | ipv4)
          }
        }) }}
    with_dict: "{{ networks }}"
    when: item.value.subnet | ipv4('prefix') == 30
  - debug:
      var: networks

Using inspec to test your ansible

Over the past few days I’ve been binge listening to the Arrested Devops podcast. In one of the recent episodes (“Career Change Into DevOps With Michael Hedgpeth, Annie Hedgpeth, And Megan Bohl (ADO102)“) one of the interviewees mentions that she got started in DevOps by using Inspec.

Essentially, inspec is a way of explaining “this is what my server must look like”, so you can then test these statements against a built machine… effectively letting you unit test your provisioning scripts.

I’ve already built a fair bit of my current personal project using Ansible, so I wasn’t exactly keen to re-write everything from scratch, but it did make me think that maybe I should have a common set of tests to see how close my server was to the hardening “Benchmark” guides from CIS… and that’s pretty easy to script in inspec, particularly as the tests in those documents list the “how to test” and “how to remediate” commands to execute.

These are in the process of being drawn up (so far, all I have is an inspec test saying “confirm you’re running on Ubuntu 16.04″… not very complex!!) but, from the looks of things, the following playbook would work relatively well!

A brief guide to using vagrant-aws

CCHits was recently asked to move it’s media to another host, and while we were doing that we noticed that many of the Monthly shows were broken in one way or another…

Cue a massive rebuild attempt!

We already have a “ShowRunner” script, which we use with a simple Vagrant machine, and I knew you can use other hypervisor “providers”, and I used to use AWS to build the shows, so why not wrap the two parts together?

Firstly, I installed the vagrant-aws plugin:

vagrant plugin install vagrant-aws

Next I amended my Vagrantfile with the vagrant-aws values mentioned in the plugin readme:

Vagrant.configure(2) do |config|
    config.vm.provider :aws do |aws, override|
    config.vm.box = "ShowMaker"
    aws.tags = { 'Name' => 'ShowMaker' }
    config.vm.box_url = "https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box"
    
    # AWS Credentials:
    aws.access_key_id = "DECAFBADDECAFBADDECAF"
    aws.secret_access_key = "DeadBeef1234567890+AbcdeFghijKlmnopqrstu"
    aws.keypair_name = "TheNameOfYourSSHKeyInTheEC2ManagementPortal"
    
    # AWS Location:
    aws.region = "us-east-1"
    aws.region_config "us-east-1", :ami => "ami-c29e1cb8" # If you pick another region, use the relevant AMI for that region
    aws.instance_type = "t2.micro" # Scale accordingly
    aws.security_groups = [ "sg-1234567" ] # Note this *MUST* be an SG ID not the name
    aws.subnet_id = "subnet-decafbad" # Pick one subnet from https://console.aws.amazon.com/vpc/home
    
    # AWS Storage:
    aws.block_device_mapping = [{
      'DeviceName' => "/dev/sda1",
      'Ebs.VolumeSize' => 8, # Size in GB
      'Ebs.DeleteOnTermination' => true,
      'Ebs.VolumeType' => "GP2", # General performance - you might want something faster
    }]
    
    # SSH:
    override.ssh.username = "ubuntu"
    override.ssh.private_key_path = "/home/youruser/.ssh/id_rsa" # or the SSH key you've generated
    
    # /vagrant directory - thanks to https://github.com/hashicorp/vagrant/issues/5401
    override.nfs.functional = false # It tries to use NFS - use RSYNC instead
  end
  config.vm.box = "ubuntu/trusty64"
  config.vm.provision "shell", path: "./run_setup.sh"
  config.vm.provision "shell", run: "always", path: "./run_showmaker.sh"
end

Of course, if you try to put this into your Github repo, it’s going to get pillaged and you’ll be spending lots of money on monero mining very quickly… so instead, I spotted this which you can do to separate out your credentials:

At the top of the Vagrantfile, add these two lines:

require_relative 'settings_aws.rb'
include SettingsAws

Then, replace the lines where you specify a “secret”, like this:

    aws.access_key_id = AWS_ACCESS_KEY
    aws.secret_access_key = AWS_SECRET_KEY

Lastly, create a file “settings_aws.rb” in the same path as your Vagrantfile, that looks like this:

module SettingsAws
    AWS_ACCESS_KEY = "DECAFBADDECAFBADDECAF"
    AWS_SECRET_KEY = "DeadBeef1234567890+AbcdeFghijKlmnopqrstu"
end

This file then can be omitted from your git repository using a .gitignore file.

Running Google MusicManager for two profiles

I’ve previously made mention of my addiction to Google Play Music… but I was called out recently, and asked about the script I used at the time. I’m sorry to say that I have had some issues with it, and instead, have resorted to using X forwarding. Here’s how I do it.

I create a user account for that other person (note, GMM will only let you upload to 3 accounts using this method. For any more, you’ll need a virtual machine!).

I then create an SSH public/private key with no passphrase.

ssh-keygen -b 2048 -N “” -C “$(whoami)@localhost” -f ~/.ssh/gmm.id_rsa

I write the public key into that new user’s .ssh/authorized_keys, by running:

ssh-copy-id -i ~/.ssh/gmm.id_rsa bloggsf@localhost

I will be prompted for the password of that account.

Finally, I create this script:

This is then added to the startup tasks of my headless-but-running-a-desktop machine.

One to read: “SKIP grep, use AWK” / ”Awk Tutorial, part {1,2,3,4}”

Do you use this pattern in your sh/bash/zsh/etc-sh scripts?

cat somefile | grep 'some string' | awk '{print $2}'

If so, you can replace that as follows:

cat somefile | awk '/some string/ {print $2}'

Or how about this?

grep -v 'something' < somefile | awk '{print $0}'

Try this:

awk '! /something/ {print $0}' < somefile

Ooo OK, how about if you want to get all actions performed by users when the ISO formatted dates (Y-m-d) match the first day of the month, but where you don’t want to also match January (unless you’re talking about the first of January)…

# echo 'BLOGGSF 2001-01-23 SOME_ACTION' | awk '$2 ~ /-01$/ {print $1, $3}'
(EMPTY LINE)
# echo 'BLOGGSF 2002-02-01 SOME_ACTION' | awk '$2 ~ /-01$/ {print $1, $3}'
BLOGGSF SOME_ACTION

This is so cool! Thanks to the tutorials “SKIP grep, use AWK” and the follow-up tutorials starting here…

Today I learned… Cloud-init doesn’t like you repeating the same things

Because of templates I was building in my post “Today I learned… Ansible Include Templates”, I thought you could repeat the same sections over again. Here’s a snippet of something like what I’d built (after combining lots of templates together):

Note this is a non-working code sample!


#cloud-config
packages:
- iperf
- git

write_files:
- content: {% include 'files/public_key.j2' %}
  path: /root/.ssh/authorized_keys
  owner: root:root
  permission: '0600'
- content: {% include 'files/private_key.j2' %}
  path: /root/.ssh/id_rsa
  owner: root:root
  permission: '0600'

packages:
- byobu

write_files:
- content: |
    #!/bin/bash
    git clone {{ test_scripts }} /root/iperf_scripts
    bash /root/iperf_scripts/run_test.sh
  path: /root/run_test
  owner: root:root
  permission: '0700'

runcmd:
- /root/run_test

I’d get *bits* of it to run – basically, the last file, the last package and the last runcmd… but not all of it.

Turns out, cloud-init doesn’t like having to rebuild all the fragments together. Instead, you need to put them all together, so the write_files items, and the packages items all live in the same area.

Which, when you think about what it’s doing, which is that the parent lines are defining a variable called… well, whatever that line is, and if you replace it, it’s only going to keep the last one, then it all makes sense really!

Today I learned… that you can look at the “cloud-init” files on your target server…

Today I have been debugging why my Cloud-init scripts weren’t triggering on my Openstack environment.

I realised that something was wrong when I tried to use the noVNC console[1] with a password I’d set… no luck. So, next I ran a command to review the console logs[2], and saw a message (now, sadly, long gone – so I can’t even include it here!) suggesting there was an issue parsing my YAML file. Uh oh!

I’m using Ansible’s os_server module, and using templates to complete the userdata field, which in turn gets populated as cloud-init scripts…. and so clearly I had two ways to debug this – prefix my ansible playbook with a few debug commands, but then that can get messy… OR SSH into the box, and look through the logs. I knew I could SSH in, so the cloud-init had partially fired, but it just wasn’t parsing what I’d submitted. I had a quick look around, and found a post which mentioned debugging cloud-init. This mentioned that there’s a path (/var/lib/cloud/instances/$UUID/) you can mess around in, to remove some files to “fool” cloud-init into thinking it’s not been run… but, I reasoned, why not just see what’s there.

And in there, was the motherlode – user-data.txt…. bingo.

In the jinja2 template I was using to populate the userdata, I’d referenced another file, again using a template. It looks like that template needs an extra line at the end, otherwise, it all runs together.

Whew!

This does concern me a little, as I had previously been using this stanza to “simply” change the default user password to something a little less complicated:


#cloud-config
ssh_pwauth: True
chpasswd:
  list: |
    ubuntu:{{ default_password }}
  expire: False

But now that I look at the documentation, I realise you can also specify that as a pre-hashed value (in which case, you would suffix that default_password item above with |password_hash('sha512')) which makes it all better again!

[1] If you run openstack --os-cloud cloud_a console url show servername gives you a URL to visit that has an HTML5 based VNC-ish client. Note the “cloud_a” and “servername” should be replaced by your clouds.yml reference and the server name or server ID you want to connect to.
[2] Like before, openstack --os-cloud cloud_a console log show servername gives you the output of the boot sequence (e.g. dmesg plus the normal startup commands, and finally, cloud-init). It can be useful. Equally, it’s logs… which means there’s a lot to wade through!