A few months ago, I was working on a personal project that needed a separate, offline Linux environment. I tried various schemes to run what I was doing within the confines of my laptop, and I couldn't make it achieve my goals. So… I bought a Raspberry Pi Zero W and a "Solderless Zero Dongle", with the intention of running Docker containers on it… unfortunately, while Docker runs on a Pi Zero, it's really hard to find base images for the ARMv6/armhf platform that the Pi Zero W uses… so I put it back in the drawer, and left it there.
Roll forward a month or so, and I was doing some experiments with Nebula, and only had an old Chromebook to test it on… except I couldn't install the Nebula client for Linux on there, and the Android client wouldn't give me some features I wanted… so I broke out that old Pi Zero W again…
Now, while the tests with Nebula I was working towards will be documented later, I found that a lot of the documentation about using a Raspberry Pi Zero as a USB gadget was rough and unexplained. So, this post breaks down much of what I found, what I tried, and what did and didn't work.
Late Edit 2021-06-04: I spotted some typos around providing specific DHCP options for interfaces, based on work I’m doing elsewhere with this script. I’ve updated these values accordingly. I’ve also created a specific branch for this revision.
Late Edit 2021-06-06: I’ve noticed this document doesn’t cover IPv6 at all right now. I started to perform some tweaks to cover IPv6, but as my ISP has decided not to bother with IPv6, and won’t support Hurricane Electric‘s Tunnelbroker system, I can’t test any of it, without building out an IPv6 test environment… maybe soon, eh?
I have been playing again, recently, with Nebula, an Open Source Peer-to-Peer VPN product which boasts speed, simplicity and in-built firewalling. Although I only have a few nodes to play with (my VPS, my NAS, my home server and my laptop), I still wanted to simplify, for me, the process of onboarding devices. So, naturally, I spent a few evenings writing a bash script that helps me to automate the creation of my Nebula nodes.
Nebula Certificates
Nebula has implemented its own certificate structure. It's similar to an x509 "TLS Certificate" (like you'd use to access an HTTPS website, or to establish an OpenVPN connection), but has a few custom fields.
In this context, I've created a Nebula Certificate Authority (CA), using this command:
nebula-cert ca -name nebula.example.org -ips 192.0.2.0/24,198.51.100.0/24,203.0.113.0/24 -groups Mobile,Workstation,Server,Lighthouse,db
So, what does this do?
Well, it creates the certificate and private key files, storing the name for the CA as “nebula.example.org” (there’s a reason for this!) and limiting the subnets and groups (like AWS or Azure Tags) the CA can issue certificates with.
Here, I've limited the CA to only issue IP addresses in the RFC5737 "Documentation" ranges, which are 192.0.2.0/24, 198.51.100.0/24 and 203.0.113.0/24, but this can easily be expanded to 10.0.0.0/8 or lots of individual subnets (I tested 1,026 separate subnets, and they worked fine).
Groups, in Nebula parlance, are building blocks of Nebula's security model (a little like AWS or Azure Tags), and can act like source or destination filters in its firewall rules. In this case, I limited the CA to only being allowed to issue certificates with the groups "Mobile", "Workstation", "Server", "Lighthouse" and "db".
As this certificate authority requires no internet access (only enough access to read and write files), I have created my Nebula CA server on a separate Micro SD card to use with a Raspberry Pi device. This is used only to generate a new CA certificate every 6 months (in theory; I've not done this part yet!), and to sign keys for all the client devices as they come on board.
I copy the ca.crt file to my target machines, and then move on to creating my client certificates.
Client Certificates
When you generate key materials for Public Key Cryptographic activities (like this one), you’re supposed to generate the private key on the source device, and the private key should never leave the device on which it’s generated. Nebula allows you to do this, using the nebula-cert command again. That command looks like this:
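Assuming we name the output files host.key and host.pub (the public part is the "host.pub" file mentioned below), that command is:

nebula-cert keygen -out-key host.key -out-pub host.pub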
If you notice, there's a key difference at this point between Nebula's key signing routine and an x509 TLS-style certificate. In TLS parlance, this stage would be called a "Certificate Signing Request" or CSR, and the CSR usually specifies the record details for the certificate (normally things like "region", "organisational unit", "subject name" and so on) before it's sent to the CA for signing (marking it as trusted).
In the Nebula world, you create a key, and send the public part of that (in this case, “host.pub” but it can have any name you like) to the CA, at which point the CA defines what IP addresses it will have, what groups it is in, and so on, so let’s do that.
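A representative signing command looks like this (the file names, group and IP address here are illustrative, but the flags are the ones discussed below):

nebula-cert sign -ca-crt ca.crt -ca-key ca.key -in-pub host.pub -out-crt host.crt -groups Server -ip 192.0.2.5/24 -name host.nebula.example.org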
Let's pick apart these options, shall we? The first four flags, "-ca-crt", "-ca-key", "-in-pub" and "-out-crt", all relate to the signing process (the nearest Nebula gets to a CSR): they read the CA certificate and key, as well as the public part of the keypair created for the host, and then define what the output certificate will be called. The next switch, -groups, identifies the tags we're assigning to this node, then (the mandatory flag) -ip sets the IP address allocated to the node. Note that the certificate is using one of the valid group names, and has been allocated a valid IP address in the ranges defined above. If you provide a value for the certificate which isn't valid, you'll get a warning message.
As a test, I bypassed the key generation and asked the CA to sign with values which don't match those constraints, and got exactly that kind of warning back.
The last part is the name of the certificate. This is relevant because Nebula has a DNS service which can resolve the Nebula IPs to the hostnames assigned on the Certificates.
Anyway… Now that we know how to generate certificates the “hard” way, let’s make life a bit easier for you. I wrote a little script – Nebula Cert Maker, also known as certmaker.sh.
certmaker.sh
So, what does certmaker.sh do that is special?
1. It auto-assigns an IP address, based on the MD5SUM of the FQDN of the node. It uses (by default) the first CIDR mask (the IP range, written as something like 192.0.2.0/24) specified in the CA certificate. If multiple CIDR masks are specified in the certificate, there's a flag you can use to select which one to use. You can override this to get a specific increment from the network address.
2. It takes the provided name (perhaps webserver) and adds, as a suffix, the name of the CA Certificate (like nebula.example.org) to the short name, to make the FQDN. This means that you don't need to run a DNS service for support staff to access machines (perhaps you'll have webserver1.nebula.example.org and webserver2.nebula.example.org as well as database.nebula.example.org).
3. Three "standard" roles have been defined for groups, these are "Server", "Workstation" and "Lighthouse" [1] (the latter because you can configure Lighthouses to be the DNS servers mentioned in step 2.) Additional groups can also be specified on the command line.
[1] A lighthouse, in Nebula terms, is a publicly accessible node, either with a static IP, or a DNS name which resolves to a known host, that can help other nodes find each other. Because all the nodes connect to it (or a couple of "it"s) this is a prime place to run the DNS server, as, well, it knows where all the nodes are!
So, given these three benefits, let’s see these in a script. This script is (at least currently) at the end of the README file in that repo.
# Create the CA
mkdir -p /tmp/nebula_ca
nebula-cert ca -out-crt /tmp/nebula_ca/ca.crt -out-key /tmp/nebula_ca/ca.key -ips 192.0.2.0/24,198.51.100.0/24 -name nebula.example.org
# First lighthouse, lighthouse1.nebula.example.org - 192.0.2.1, group "Lighthouse"
./certmaker.sh --cert_path /tmp/nebula_ca --name lighthouse1 --ip 1 --lighthouse
# Second lighthouse, lighthouse2.nebula.example.org - 192.0.2.2, group "Lighthouse"
./certmaker.sh -c /tmp/nebula_ca -n lighthouse2 -i 2 -l
# First webserver, webserver1.nebula.example.org - 192.0.2.168, groups "Server" and "web"
./certmaker.sh --cert_path /tmp/nebula_ca --name webserver1 --server --group web
# Second webserver, webserver2.nebula.example.org - 192.0.2.191, groups "Server" and "web"
./certmaker.sh -c /tmp/nebula_ca -n webserver2 -s -g web
# Database Server, db.nebula.example.org - 192.0.2.182, groups "Server" and "db"
./certmaker.sh --cert_path /tmp/nebula_ca --name db --server --group db
# First workstation, admin1.nebula.example.org - 198.51.100.205, group "Workstation"
./certmaker.sh --cert_path /tmp/nebula_ca --index 1 --name admin1 --workstation
# Second workstation, admin2.nebula.example.org - 198.51.100.77, group "Workstation"
./certmaker.sh -c /tmp/nebula_ca -d 1 -n admin2 -w
# First Mobile device - Create the private/public key pairing first
nebula-cert keygen -out-key mobile1.key -out-pub mobile1.pub
# Then sign it, mobile1.nebula.example.org - 198.51.100.217, group "mobile"
./certmaker.sh --cert_path /tmp/nebula_ca --index 1 --name mobile1 --group mobile --public mobile1.pub
# Second Mobile device - Create the private/public key pairing first
nebula-cert keygen -out-key mobile2.key -out-pub mobile2.pub
# Then sign it, mobile2.nebula.example.org - 198.51.100.22, group "mobile"
./certmaker.sh -c /tmp/nebula_ca -d 1 -n mobile2 -g mobile -p mobile2.pub
Technically, the mobile device steps simulate creating the private key locally and sharing only the public part of that key. They also simulate what might happen in a more controlled environment, where not everything is run locally.
So, let's pick out some spots where this content might be confusing. I've run each type of invocation twice: once with the short version of the flags (e.g. -c instead of --cert_path, -n instead of --name), and once with the longer versions. Before each ./certmaker.sh command, I've added a comment showing what the hostname will be, the IP address, and the Nebula Groups assigned to that node.
It is also possible to override the FQDN with your own FQDN, but that option isn't shown here. Also, if the CA doesn't provide a CIDR mask, one will be selected for you (10.44.88.0/24), or you can provide one with the -b/--subnet flag.
If the CA has multiple names (e.g. nebula.example.org and nebula.example.com), then the name for the host certificates will be host.nebula.example.org and also host.nebula.example.com.
Using Bash
So, if you’ve looked at, well, almost anything on my site, you’ll see that I like to use tools like Ansible and Terraform to deploy things, but for something which is going to be run on this machine, I’d like to keep things as simple as possible… and there’s not much in this script that needed more than what Bash offers us.
For those who don't know, bash is the default shell for most modern Linux distributions and many Docker containers. It can perform regular expression parsing (checking that strings, or specific collections of characters, appear in a variable), do mathematics, and run extensive loops and checks on values.
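As a flavour of that (this isn't lifted from certmaker.sh, just a minimal sketch), here's bash doing a regular expression check on an IPv4 address and some arithmetic on the last octet:

#!/bin/bash
ip="192.0.2.10"
# Does the string look like a dotted-quad IPv4 address?
if [[ "$ip" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]
then
  # Split on the dots and increment the final octet
  IFS=. read -r o1 o2 o3 o4 <<< "$ip"
  echo "Next address: ${o1}.${o2}.${o3}.$(( o4 + 1 ))"
else
  echo "Not an IPv4 address: $ip" >&2
fi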
So, take a look at the internals of the script, if you want to know some options on writing bash scripts that manipulate IP addresses and read the output of files!
If you’re looking for some simple tasks to start your portfolio of work, there are some “good first issue” tasks in the “issues” of the repo, and I’d be glad to help you work through them.
Wrap up
I hope you enjoy using this script, and I hope, if you’re planning on writing some bash scripts any time soon, that you take a look over the code and consider using some of the templates I reference.
I tend to write long and overly complicated set_fact statements in Ansible, ALL THE DAMN TIME. I write stuff like this:
rulebase: |
{
{% for var in vars | dict2items %}
{% if var.key | regex_search(regex_rulebase_match) | type_debug != "NoneType"
and (
var.value | type_debug == "dict"
or var.value | type_debug == "AnsibleMapping"
) %}
{% for item in var.value | dict2items %}
{% if item.key | regex_search(regex_rulebase_match) | type_debug != "NoneType"
and (
item.value | type_debug == "dict"
or item.value | type_debug == "AnsibleMapping"
) %}
"{{ var.key | regex_replace(regex_rulebase_match, '\2') }}{{ item.key | regex_replace(regex_rulebase_match, '\2') }}": {
{# This block is used for rulegroup level options #}
{% for key in ['log_from_start', 'log', 'status', 'nat', 'natpool', 'schedule', 'ips_enable', 'ssl_ssh_profile', 'ips_sensor'] %}
{% if var.value[key] is defined and rule.value[key] is not defined %}
{% if var.value[key] | type_debug in ['string', 'AnsibleUnicode'] %}
"{{ key }}": "{{ var.value[key] }}",
{% else %}
"{{ key }}": {{ var.value[key] }},
{% endif %}
{% endif %}
{% endfor %}
{% for rule in item.value | dict2items %}
{% if rule.key in ['sources', 'destinations', 'services', 'src_internet_service', 'dst_internet_service'] and rule.value | type_debug not in ['list', 'AnsibleSequence'] %}
"{{ rule.key }}": ["{{ rule.value }}"],
{% elif rule.value | type_debug in ['string', 'AnsibleUnicode'] %}
"{{ rule.key }}": "{{ rule.value }}",
{% else %}
"{{ rule.key }}": {{ rule.value }},
{% endif %}
{% endfor %}
},
{% endif %}
{% endfor %}
{% endif %}
{% endfor %}
}
Now, if you're writing set_fact or vars like this a lot, what you tend to end up with is the dreaded dict2items requires a dictionary, got … instead. error, which basically means "Hah! You wrote a giant blob of what you thought was JSON, but it didn't render right, so we cast it to a string for you!"
The way I usually write my playbooks, I'll do something with this set_fact at line, let's say, 10, and then use it at line, let's say, 500… so I don't know what the bloomin' thing looks like by then! To catch this early, I've taken to adding a type-check assertion right after the set_fact, like this one which checks that a value is a string:
- name: Type Check - is_a_string
assert:
quiet: yes
that:
- vars[this_key] is not boolean
- vars[this_key] is not number
- vars[this_key] | int | string != vars[this_key] | string
- vars[this_key] | float | string != vars[this_key] | string
- vars[this_key] is string
- vars[this_key] is not mapping
- vars[this_key] is iterable
success_msg: "{{ this_key }} is a string"
fail_msg: |-
{{ this_key }} should be a string, and is instead
{%- if vars[this_key] is not defined %} undefined
{%- else %} {{ vars[this_key] is boolean | ternary(
'a boolean',
(vars[this_key] | int | string == vars[this_key] | string) | ternary(
'an integer',
(vars[this_key] | float | string == vars[this_key] | string) | ternary(
'a float',
vars[this_key] is string | ternary(
'a string',
vars[this_key] is mapping | ternary(
'a dict',
vars[this_key] is iterable | ternary(
'a list',
'unknown (' ~ vars[this_key] | type_debug ~ ')'
)
)
)
)
)
)}}{% endif %} - {{ vars[this_key] | default('unset') }}
I hope this helps you, bold traveller with complex jinja2 templating requirements!
(Oh, and if you get “template error while templating string: no test named 'boolean'“, you’re probably running Ansible which you installed using apt from Ubuntu Universe, version 2.9.6+dfsg-1 [or, at least I was!] – to fix this, use pip to install a more recent version – preferably using virtualenv first!)
My current Ansible project relies on me collecting a lot of data from AWS and then checking it again later, to see if something has changed.
This is great for one-off tests (e.g. terraform destroy ; terraform apply ; ansible-playbook run.yml) but isn't great for repetitive tests, especially if collecting the data takes many minutes to run all the actions, or if you have slow or unreliable internet in your development environment.
To get around this, I wrote a wrapper for caching this data.
At the top of my playbook, run.yml, I have these tasks:
- name: Set Online Status.
# This stores the value of run_online, unless run_online
# is not set, in which case, it defines it as "true".
ansible.builtin.set_fact:
run_online: |-
{{- run_online | default(true) | bool -}}
- name: Create cache_data path.
# This creates a "cached_data" directory in the same
# path as the playbook.
when: run_online | bool and cache_data | default(false) | bool
delegate_to: localhost
run_once: true
file:
path: "cached_data"
state: directory
mode: 0755
- name: Create cache_data for host.
# This creates a directory under "cached_data" in the same
# path as the playbook, with the name of each of the inventory
# items.
when: run_online | bool and cache_data | default(false) | bool
delegate_to: localhost
file:
path: "cached_data/{{ inventory_hostname }}"
state: directory
mode: 0755
Running this sets up an expectation for the normal operation of the playbook, that it will be “online”, by default.
Then, every time I need to call something “online”, for example, collect EC2 Instance Data (using the community.aws.ec2_instance_info module), I call out to (something like) this set of tasks, instead of just calling the task by itself.
- name: List all EC2 instances in the regions of interest.
when: run_online | bool
community.aws.ec2_instance_info:
region: "{{ item.region_name }}"
loop: "{{ regions }}"
loop_control:
label: "{{ item.region_name }}"
register: regional_ec2
- name: "NOTE: Set regional_ec2 data path"
when: not run_online | bool or cache_data | default(false) | bool
set_fact:
regional_ec2_cached_data_file_loop: "{{ regional_ec2_cached_data_file_loop | default(0) | int + 1 }}"
cached_data_filename: "cached_data/{{ inventory_hostname }}/{{ cached_data_file | default('regional_ec2') }}.{{ regional_ec2_cached_data_file_loop | default(0) | int + 1 }}.json"
- name: "NOTE: Cache/Get regional_ec2 data path"
when: not run_online | bool or cache_data | default(false) | bool
debug:
msg: "File: {{ cached_data_filename }}"
- name: Cache all EC2 instances in the regions of interest.
when: run_online | bool and cache_data | default(false) | bool
delegate_to: localhost
copy:
dest: "{{ cached_data_filename }}"
mode: "0644"
content: "{{ regional_ec2 }}"
- name: "OFFLINE: Load all EC2 instances in the regions of interest."
when: not run_online | bool
set_fact:
regional_ec2: "{% include( cached_data_filename ) %}"
The first task, if we're still set to being "online", will execute and register the result for later. If cache_data is configured, we generate a filename for the caching, record the filename to the log (via the debug task) and then store it (using the copy task). So far, so online… but what happens when we don't need the instance to be up and running?
In that case, we use the set_fact module, triggered by running the playbook like this: ansible-playbook run.yml -e run_online=false. This reads the cached data out of that locally stored pool of data for later use.
I recently got a new laptop, and for various reasons, I’m going to be primarily running Windows on that laptop. However, I still like having a working SSH server, running in the context of my Windows Subsystem for Linux (WSL) environment.
Initially, trying to run service ssh start failed with an error, because you need to re-execute the ssh configuration steps which are missed in a WSL environment. To fix that, run sudo apt install --reinstall openssh-server.
Once you know your service runs OK, you start digging around to find out how to start it on boot, and you’ll see lots of people saying things like “Just run a shell script that starts your first service, and then another shell script for the next service.”
Well, the frustration for me is that Linux already has this capability – the current popular version is called systemd, but a slightly older variant is still knocking around in modern Linux distributions, and it's called SysV init, often referred to as just "sysv" or "init.d".
The way that those services work is that you have an “init” file in /etc/init.d and then those files have a symbolic link into a “runlevel” directory, for example /etc/rc3.d. Each symbolic link is named S##service or K##service, where the ## represents the order in which it’s to be launched. The SSH Daemon, for example, that I want to run is created in there as /etc/rc3.d/S01ssh.
So, how do I make this work in the grander scheme of WSL? I can't use systemd (where I could just say systemctl enable --now ssh); instead I need to add a (yes, I know) shell script, which looks in my desired runlevel directory. Runlevel 3 is the level at which network services have started, hence using that one. If I was trying to set up a graphical desktop, I'd instead be looking to use Runlevel 5, but the X Window System isn't ported to Windows like that yet… Anyway.
Because the rc#.d directory already has this structure for ordering and naming services to load, I can just step through that directory looking for files which match or do not match the naming convention, and I do that with this script:
#! /bin/bash
function run_rc() {
base="$(basename "$1")"
if [[ ${base:0:1} == "S" ]]
then
"$1" start
else
"$1" stop
fi
}
if [ "$1" != "" ] && [ -e "$1" ]
then
run_rc "$1"
else
rc=3
if [ "$1" != "" ] && [ -e "/etc/rc${$1}.d/" ]
then
rc="$1"
fi
for digit1 in {0..9}
do
for digit2 in {0..9}
do
find "/etc/rc${rc}.d/" -name "[SK]${digit1}${digit2}*" -exec "$0" '{}' \; 2>/dev/null
done
done
fi
I’ve put this script in /opt/wsl_init.sh
This does a bit of trickery, but basically runs the bottom block first. It loops over the digits 0 to 9 twice (giving you 00, 01, 02 and so on up to 99) and looks in /etc/rc3.d for any file whose name starts with S or K followed by the two digits you've looped to at that point. Finally, it runs itself again, passing the name of the file it just found, and this is where the top block comes in.
In the top block we look at the "basename" – the part of the path supplied, without any prefixed directories – and then extract just the first character (that's the ${base:0:1} part) to see whether it's an "S" or anything else. If it's an S (which everything there is likely to be), it executes the task like this: /etc/rc3.d/S01ssh start, and this works because that's how those scripts are designed! You can run any of the following versions of this command: service ssh start, /etc/init.d/ssh start or /etc/rc3.d/S01ssh start. There are other options, notably "stop" or "status", but these aren't really useful here.
Now, how do we make Windows execute this on boot? I'm using NSSM, the "Non-Sucking Service Manager", to add a line to the Windows System services. I placed the NSSM executable in C:\Program Files\nssm\nssm.exe, and then from a command line, ran "C:\Program Files\nssm\nssm.exe" install WSL_Init.
I configured it with the Application Path: C:\Windows\System32\wsl.exe and the Arguments: -d ubuntu -e sudo /opt/wsl_init.sh. Note that this only works because I've also got sudo set up to execute this command without prompting for a password.
And then I rebooted. SSH was running as I needed it.
This is a quick note, having stumbled over this one today.
Mostly these days, I’m used to using Terraform to create Elastic IP (EIP) items in AWS, and I can assign tags to them during creation. For various reasons in $Project I’m having to create my EIPs in Ansible.
To make this work, you can’t just create an EIP with tags (like you would in Terraform), instead what you need to do is to create the EIP and then tag it, like this:
- name: Allocate a new elastic IP
community.aws.ec2_eip:
state: present
in_vpc: true
region: eu-west-1
register: eip
- name: Tag that resource
amazon.aws.ec2_tag:
region: eu-west-1
resource: "{{ eip.allocation_id }}"
state: present
tags:
Name: MyTag
register: tag
Notice that we create a VPC associated EIP, and assign the allocation_id from the result of that module to the resource we want to tag.
How about if you’re trying to be a bit more complex?
Here I have a list of EIPs I want to create, and then I pass this into the ec2_eip module, like this:
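Here's a sketch of that (the variable eip_names and the tag layout are illustrative, not lifted from the real project):

- name: Allocate an elastic IP per name in the list
  community.aws.ec2_eip:
    state: present
    in_vpc: true
    region: eu-west-1
  loop: "{{ eip_names }}"
  register: eips

- name: Tag each elastic IP with the name it was created for
  amazon.aws.ec2_tag:
    region: eu-west-1
    resource: "{{ item.allocation_id }}"
    state: present
    tags:
      Name: "{{ item.item }}"
  loop: "{{ eips.results }}"
  loop_control:
    label: "{{ item.item }}"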
So, in this instance we pass the list of EIP names we want to create to the loop instruction. At the point we create them, we don't actually know what they'll be called, but we loop over the names anyway, because when we come to tag them, each registered result carries the "item" (from the loop) that was used to create that EIP. When we then tag the EIP, we can use some of the data that was returned from the ec2_eip module (region, EIP allocation ID and the name we used as the loop key). I've trimmed out the debug statements I created while writing this, but here's what you get back from ec2_eip:
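Trimmed right down, each entry in the registered results list looks something like this (illustrative values; the real return includes several more fields):

{
  "changed": true,
  "item": "eip-one",
  "allocation_id": "eipalloc-0123456789abcdef0",
  "public_ip": "203.0.113.10"
}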
For a project I’m working on, I needed to define a list of ports, and set some properties on some of them. In the Ansible world, you’d use statements like:
{% if data.somekey is defined %}something {{ data.somekey }}{% endif %}
or
{{ data.somekey | default('') }}
In a pinch, you can also do this:
{{ (data | default({}) ).somekey | default('') }}
With Terraform, I was finding it much harder to work out how to check whether a value exists as part of a map (the Terraform term for a dictionary in Ansible terms, or an associative array in PHP terms), until I stumbled over the lookup function. Here's how that looks in a simple Terraform file:
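A minimal sketch of the idea (the variable, port list and default string are all illustrative): lookup(map, key, default) returns the value stored against the key if it exists, or the default if it doesn't.

variable "port_descriptions" {
  description = "Optional descriptions for some ports."
  type        = map(string)
  default = {
    "80"  = "HTTP"
    "443" = "HTTPS"
  }
}

locals {
  ports = ["80", "443", "8080"]
}

output "described_ports" {
  # Ports with no entry in the map fall back to the default string
  value = { for port in local.ports : port => lookup(var.port_descriptions, port, "no description set") }
}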
In a project I'm working on in Terraform, I've got several feature flags in a module. These flags relate to whether this module should turn on a system in a cloud provider or not, and they look like this:
variable "turn_on_feature_x" {
description = "Setting this to 'yes' will enable Feature X. Any other value will disable it. (Default 'yes')"
value = "yes"
}
variable "turn_on_feature_y" {
description = "Setting this to 'yes' will enable Feature Y. Any other value will disable it. (Default 'no')"
value = "no"
}
When I call the module, I then can either leave the feature with the default values, or selectively enable or disable them, like this:
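For example (the module name and source path are made up for illustration):

module "my_system" {
  source = "./modules/my_system"

  # Leave Feature X at its default ("yes"), but explicitly enable Feature Y
  turn_on_feature_y = "yes"
}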
When I then want to use the feature, I have to remember a couple of key parts.
Normally this feature check is done with a "count" statement, and the easiest way to use it is to use the ternary operator to check the flag and return a 1 or a 0, depending on whether you want the resource created.
Ternary operators look like this: var.turn_on_feature_x == "yes" ? 1 : 0 which basically means, if the value of the variable turn_on_feature_x is set to “yes”, then return 1 otherwise return 0.
This can get a bit complex, particularly if you want to check several flags at once, like this: var.turn_on_feature_x == "yes" ? var.turn_on_feature_y == "yes" ? 1 : 0 : 0. I've found that wrapping them in brackets helps me understand what I'm getting, like this:
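For example (the resource type here is just a placeholder, matching the next paragraph):

resource "some_provider_service" "my_name" {
  # Create this resource only when both feature flags are set to "yes"
  count = (var.turn_on_feature_x == "yes" ? (var.turn_on_feature_y == "yes" ? 1 : 0) : 0)
}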
If you end up using a count statement, the resulting value must be treated as a 0-indexed array, like this: some_provider_service.my_name[0].result
This is because, using the count value says “I want X number of resources”, so Terraform has to treat it as an array, in case you actually wanted 10 instead of 1 or 0.
One of the things I’m currently playing with is a project to deploy some FortiGate Firewalls into cloud platforms. I have a couple of Evaluation Licenses I can use (as we’re a partner), but when it comes to automatically scaling, you need to use the PAYG license.
To try to keep my terraform files as reusable as possible, I came up with this workaround. It's likely to be useful in other places too. Enjoy!
This next block is stored in license.tf and basically says “by default, you have no license.”
variable "license_file" {
default = ""
description = "Path to the license file to load, or leave blank to use a PAYG license."
}
We can either override this with a command line switch (terraform apply -var 'license_file=mylicense.lic'), or (more likely) with an override file named license_override.tf (ignored in Git), which has this next block in it:
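Something like this (the filename is an illustration; it's whatever your licence file is actually called):

variable "license_file" {
  default     = "mylicense.lic"
  description = "Path to the license file to load, or leave blank to use a PAYG license."
}

Because Terraform loads files ending in _override.tf last and merges them over the rest of the configuration, this default wins over the empty one in license.tf.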
This next block is also stored in license.tf and says "If var.license_file is not empty, load that license file [var.license_file != "" ? var.license_file], but if it is empty, check whether /dev/null exists (*nix platforms) [fileexists("/dev/null")], in which case use /dev/null, otherwise use the NUL: device (Windows platforms)."
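Reconstructed from that description, the data source looks something like this:

data "local_file" "license" {
  # Use the supplied licence file if there is one; otherwise fall back to the
  # platform's empty device (/dev/null on *nix, NUL: on Windows)
  filename = var.license_file != "" ? var.license_file : (fileexists("/dev/null") ? "/dev/null" : "NUL:")
}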
👉 Just as an aside, I’ve seen this “ternary” construct in a few languages. It basically looks like this: boolean_operation ? true_value : false_value
That check, logically, could have been written like this instead: "%{if boolean_operation}${true_value}%{else}${false_value}%{endif}"
By combining two of these together, while initially it looks far more messy and hard to parse, I’ve found that, especially in single-line statements, it’s much more compact and eventually easier to read than the alternative if/else/endif structure.
So, this means that we can now refer to data.local_file.license as our data source.
Next, I want to select either the PAYG (Pay As You Go) or BYOL (Bring Your Own License) licensed AMI in AWS (the same principle applies in Azure, GCP, etc.), so in this block we provide a different value to the filter in the AMI Data Source: the string "FortiGate-VM64-AWS *x.y.z*" if we have provided a license, or "FortiGate-VM64-AWSONDEMAND *x.y.z*" if we haven't.
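A sketch of that AMI lookup (the owners filter and the exact version wildcard here are illustrative, not the values from my project):

data "aws_ami" "fortigate" {
  most_recent = true
  owners      = ["aws-marketplace"]

  filter {
    name = "name"
    # BYOL image names when a licence file is supplied, PAYG/on-demand names otherwise
    values = [var.license_file != "" ? "FortiGate-VM64-AWS *x.y.z*" : "FortiGate-VM64-AWSONDEMAND *x.y.z*"]
  }
}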
And so that is how I can elect to provide a license, or use a pre-licensed image from AWS, and these lessons can also be applied in an Azure or GCP environment too.
TL;DR? It's possible to work out what type of variable you're working with in Ansible. The in-built filters don't always do quite what you're expecting. Jump to the "In summary" heading for my suggestions.
LATE EDIT: 2021-05-23 After raising a question in #ansible on Freenode, flowerysong noticed that my truth table around mappings, iterables and strings was wrong. I’ve amended the table accordingly, and have added a further note below the table.
One of the things I end up doing quite a bit with Ansible is value manipulation. I know it’s not really normal, but… well, I like rewriting values from one type of a thing to the next type of a thing.
For example, I like taking a value that I don’t know if it’s a list or a string, and passing that to an argument that expects a list.
Doing it wrong, getting it better
Until recently, I’d do that like this:
- debug:
msg: |-
{
{%- if value | type_debug == "string" or value | type_debug == "AnsibleUnicode" -%}
"string": "{{ value }}"
{%- elif value | type_debug == "dict" or value | type_debug == "ansible_mapping" -%}
"dict": {{ value }}
{%- elif value | type_debug == "list" -%}
"list": {{ value }}
{%- else -%}
"other": "{{ value }}"
{%- endif -%}
}
But, after finding this gist, I now know I can do this:
- debug:
msg: |-
{
{%- if value is string -%}
"string": "{{ value }}"
{%- elif value is mapping -%}
"dict": {{ value }}
{%- elif value is iterable -%}
"list": {{ value }}
{%- else -%}
"other": "{{ value }}"
{%- endif -%}
}
So, how would I use this, given the context of what I was saying before?
- assert:
that:
- value is string
- value is not mapping
- value is iterable
- some_module:
some_arg: |-
{%- if value is string -%}
["{{ value }}"]
{%- else -%}
{{ value }}
{%- endif -%}
More details on finding a type
Why in this order? Well, because of how values are stored in Ansible, the following states are true:
| ⬇️Type \ ➡️Check | is iterable | is mapping | is sequence | is string |
|---|---|---|---|---|
| a_dict (e.g. {}) | ✔️ | ✔️ | ✔️ | ❌ |
| a_list (e.g. []) | ✔️ | ❌ | ✔️ | ❌ |
| a_string (e.g. "") | ✔️ | ❌ | ✔️ | ✔️ |

A comparison between value types
So, if you were to check for is iterable first, you might match on a_list or a_dict instead of a_string, but string can only match on a_string. Once you know it can’t be a string, you can check whether something is mapping – again, because a mapping can only match a_dict, but it can’t match a_list or a_string. Once you know it’s not that, you can check for either is iterable or is sequence because both of these match a_string, a_dict and a_list.
LATE EDIT: 2021-05-23 Note that a prior revision of this table and its following paragraph showed "is mapping" as true for a_string. This is not correct, and has been fixed, both in the table and the paragraph.
Likewise, if you wanted to check whether a_float and an_integer match is number and not is string, you can check these:
| ⬇️Type \ ➡️Check | is float | is integer | is iterable | is mapping | is number | is sequence | is string |
|---|---|---|---|---|---|---|---|
| a_float | ✔️ | ❌ | ❌ | ❌ | ✔️ | ❌ | ❌ |
| an_integer | ❌ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |

A comparison between types of numbers
So again, a_float and an_integer don’t match is string, is mapping or is iterable, but they both match is number and they each match their respective is float and is integer checks.
How about each of those (a_float and an_integer) wrapped in quotes, making them a string? What happens then?
| ⬇️Type \ ➡️Check | is float | is integer | is iterable | is mapping | is number | is sequence | is string |
|---|---|---|---|---|---|---|---|
| a_float_as_string | ❌ | ❌ | ✔️ | ❌ | ❌ | ✔️ | ✔️ |
| an_integer_as_string | ❌ | ❌ | ✔️ | ❌ | ❌ | ✔️ | ✔️ |

A comparison between types of numbers when held as a string
This is somewhat interesting, because they look like a number, but they’re actually “just” a string. So, now you need to do some comparisons to make them look like numbers again to check if they’re numbers.
Changing the type of a string
What happens if you cast the values? Casting means to convert from one type of value (e.g. string) into another (e.g. float) and to do that, Ansible has three filters we can use, float, int and string. You can’t cast to a dict or a list, but you can use dict2items and items2dict (more on those later). So let’s start with casting our group of a_ and an_ items from above. Here’s a list of values I want to use:
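These are the same values you'll find in the vars block of the test playbook at the end of this post:

an_int: 1
a_float: 1.1
a_string: "string"
an_int_as_string: "1"
a_float_as_string: "1.1"
a_list:
  - item1
a_dict:
  key1: value1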
With each of these values, I show the value as Ansible knows it, what happens when you cast it as a float with {{ value | float }}, as an integer with {{ value | int }}, and as a string with {{ value | string }}. Some of these results are interesting. Note that where you see u'some value', it means that Python converted that string to a Unicode string.
| ⬇️Value \ ➡️Cast | value | value when cast as float | value when cast as integer | value when cast as string |
|---|---|---|---|---|
| a_dict | {"key1": "value1"} | 0.0 | 0 | "{u'key1': u'value1'}" |
| a_float | 1.1 | 1.1 | 1 | "1.1" |
| a_float_as_string | "1.1" | 1.1 | 1 | "1.1" |
| a_list | ["item1"] | 0.0 | 0 | "[u'item1']" |
| a_string | "string" | 0.0 | 0 | "string" |
| an_int | 1 | 1 | 1 | "1" |
| an_int_as_string | "1" | 1 | 1 | "1" |

Casting between value types
So, what does this mean for us? Well, not a great deal, aside from to note that you can “force” a number to be a string, or a string which is “just” a number wrapped in quotes can be forced into being a number again.
Oh, and casting dicts to lists and back again? This one is actually pretty clearly documented in the current set of documentation (as at 2.9 at least!)
Checking for miscast values
How about if I want to know whether a value I think might be a float stored as a string, how can I check that?
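The check I ended up with (the same one that appears in the summary at the end of this post) looks like this for a float, and swaps in | int for an integer:

{{ value | float | string == value | string }}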
What is this? If I cast a value that I think might be a float, to a float, and then turn both the cast value and the original into a string, do they match? If I’ve got a string or an integer, then I’ll get a false, but if I have actually got a float, then I’ll get true. Likewise for casting an integer. Let’s see what that table looks like:
| ⬇️Type \ ➡️Check | value when cast as float | value when cast as integer | value when cast as string |
|---|---|---|---|
| a_float | ✔️ | ❌ | ✔️ |
| a_float_as_string | ✔️ | ❌ | ✔️ |
| an_integer | ❌ | ✔️ | ✔️ |
| an_integer_as_string | ❌ | ✔️ | ✔️ |

A comparison between types of numbers when cast to a string
So this shows us the values we were after – even if you’ve got a float (or an integer) stored as a string, by doing some careful casting, you can confirm they’re of the type you wanted… and then you can pass them through the right filter to use them in your playbooks!
Booleans
Last thing to check: boolean values, "True" or "False". There's a bit of confusion here, as a "boolean" can be true or false, yes or no, 1 or 0. However, are true, True and TRUE the same? How about false, False and FALSE? Let's take a look!
| ⬇️Value \ ➡️Check | type_debug | is boolean | is number | is iterable | is mapping | is string | value when cast as bool | value when cast as string | value when cast as integer |
|---|---|---|---|---|---|---|---|---|---|
| yes | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | True | True | 1 |
| Yes | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | Yes | 0 |
| YES | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | YES | 0 |
| "yes" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | yes | 0 |
| "Yes" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | Yes | 0 |
| "YES" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | YES | 0 |
| true | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | True | True | 1 |
| True | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | True | True | 1 |
| TRUE | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | True | True | 1 |
| "true" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | true | 0 |
| "True" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | True | 0 |
| "TRUE" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | TRUE | 0 |
| 1 | int | ❌ | ✔️ | ❌ | ❌ | ❌ | True | 1 | 1 |
| "1" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | True | 1 | 1 |
| no | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | False | False | 0 |
| No | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | False | False | 0 |
| NO | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | False | False | 0 |
| "no" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | no | 0 |
| "No" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | No | 0 |
| "NO" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | NO | 0 |
| false | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | False | False | 0 |
| False | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | False | False | 0 |
| FALSE | bool | ✔️ | ✔️ | ❌ | ❌ | ❌ | False | False | 0 |
| "false" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | false | 0 |
| "False" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | False | 0 |
| "FALSE" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | FALSE | 0 |
| 0 | int | ❌ | ✔️ | ❌ | ❌ | ❌ | False | 0 | 0 |
| "0" | AnsibleUnicode | ❌ | ❌ | ✔️ | ❌ | ✔️ | False | 0 | 0 |

Comparisons between various stylings of boolean representations
So, the stand-out thing for me here is that all of the string values of the boolean representations (those wrapped in quotes, like this: "yes") are treated as strings and shouldn't be considered booleans (unless you cast them explicitly!), and all of the non-string versions of true, false and no are considered to be boolean, but yes, Yes and YES are treated differently, depending on case. So, what would I do?
In summary
Consistently use no or yes, true or false in lower case to indicate a boolean value. Don’t use 1 or 0 unless you have to.
If you're checking that you're working with a string, a list or a dict, check in this order: string (using is string), dict (using is mapping) and then list (using is sequence or is iterable).
Checking for numbers that are stored as strings? Cast your string through the type check for that number, like this: {% if value | float | string == value | string %}{{ value | float }}{% elif value | int | string == value | string %}{{ value | int }}{% else %}{{ value }}{% endif %}
Try not to use type_debug unless you really can’t find any other way. These values will change between versions, and this caused me a lot of issues with a large codebase I was working on a while ago!
Run these tests yourself!
Want to run these tests yourself? Here’s the code I ran (also available in a Gist on GitHub), using Ansible 2.9.10.
---
- hosts: localhost
gather_facts: no
vars:
an_int: 1
a_float: 1.1
a_string: "string"
an_int_as_string: "1"
a_float_as_string: "1.1"
a_list:
- item1
a_dict:
key1: value1
tasks:
- debug:
msg: |
{
{% for var in ["an_int", "an_int_as_string","a_float", "a_float_as_string","a_string","a_list","a_dict"] %}
"{{ var }}": {
"type_debug": "{{ vars[var] | type_debug }}",
"value": "{{ vars[var] }}",
"is float": "{{ vars[var] is float }}",
"is integer": "{{ vars[var] is integer }}",
"is iterable": "{{ vars[var] is iterable }}",
"is mapping": "{{ vars[var] is mapping }}",
"is number": "{{ vars[var] is number }}",
"is sequence": "{{ vars[var] is sequence }}",
"is string": "{{ vars[var] is string }}",
"value cast as float": "{{ vars[var] | float }}",
"value cast as integer": "{{ vars[var] | int }}",
"value cast as string": "{{ vars[var] | string }}",
"is same when cast to float": "{{ vars[var] | float | string == vars[var] | string }}",
"is same when cast to integer": "{{ vars[var] | int | string == vars[var] | string }}",
"is same when cast to string": "{{ vars[var] | string == vars[var] | string }}",
},
{% endfor %}
}
---
- hosts: localhost
gather_facts: false
vars:
# true, True, TRUE, "true", "True", "TRUE"
a_true: true
a_true_initial_caps: True
a_true_caps: TRUE
a_string_true: "true"
a_string_true_initial_caps: "True"
a_string_true_caps: "TRUE"
# yes, Yes, YES, "yes", "Yes", "YES"
a_yes: yes
a_yes_initial_caps: Yes
a_yes_caps: YES
a_string_yes: "yes"
a_string_yes_initial_caps: "Yes"
a_string_yes_caps: "YES"
# 1, "1"
a_1: 1
a_string_1: "1"
# false, False, FALSE, "false", "False", "FALSE"
a_false: false
a_false_initial_caps: False
a_false_caps: FALSE
a_string_false: "false"
a_string_false_initial_caps: "False"
a_string_false_caps: "FALSE"
# no, No, NO, "no", "No", "NO"
a_no: no
a_no_initial_caps: No
a_no_caps: NO
a_string_no: "no"
a_string_no_initial_caps: "No"
a_string_no_caps: "NO"
# 0, "0"
a_0: 0
a_string_0: "0"
tasks:
- debug:
msg: |
{
{% for var in ["a_true","a_true_initial_caps","a_true_caps","a_string_true","a_string_true_initial_caps","a_string_true_caps","a_yes","a_yes_initial_caps","a_yes_caps","a_string_yes","a_string_yes_initial_caps","a_string_yes_caps","a_1","a_string_1","a_false","a_false_initial_caps","a_false_caps","a_string_false","a_string_false_initial_caps","a_string_false_caps","a_no","a_no_initial_caps","a_no_caps","a_string_no","a_string_no_initial_caps","a_string_no_caps","a_0","a_string_0"] %}
"{{ var }}": {
"type_debug": "{{ vars[var] | type_debug }}",
"value": "{{ vars[var] }}",
"is float": "{{ vars[var] is float }}",
"is integer": "{{ vars[var] is integer }}",
"is iterable": "{{ vars[var] is iterable }}",
"is mapping": "{{ vars[var] is mapping }}",
"is number": "{{ vars[var] is number }}",
"is sequence": "{{ vars[var] is sequence }}",
"is string": "{{ vars[var] is string }}",
"is bool": "{{ vars[var] is boolean }}",
"value cast as float": "{{ vars[var] | float }}",
"value cast as integer": "{{ vars[var] | int }}",
"value cast as string": "{{ vars[var] | string }}",
"value cast as bool": "{{ vars[var] | bool }}",
"is same when cast to float": "{{ vars[var] | float | string == vars[var] | string }}",
"is same when cast to integer": "{{ vars[var] | int | string == vars[var] | string }}",
"is same when cast to string": "{{ vars[var] | string == vars[var] | string }}",
"is same when cast to bool": "{{ vars[var] | bool | string == vars[var] | string }}",
},
{% endfor %}
}
Featured image is “Kelvin Test” by “Eelke” on Flickr and is released under a CC-BY license.