"Stockholms Stadsbibliotek" by "dilettantiquity" on Flickr

Terraform templates with Maps

For a project I’m working on, I needed to define a list of ports, and set some properties on some of them. In the Ansible world, you’d use statements like:

{% if data.somekey is defined %}something {{ data.somekey }}{% endif %}

or

{{ data.somekey | default('') }}

In a pinch, you can also do this:

{{ (data | default({}) ).somekey | default('') }}

With Terraform, I was finding it much harder to work out how to find whether a value as part of a map (the Terraform term for a Dictionary in Ansible terms, or an Associative Array in PHP terms), until I stumbled over the Lookup function. Here’s how that looks for just a simple Terraform file:

output "test" {
    value = templatefile(
        "template.tmpl",
        {
            ports = {
                "eth0": {"ip": "192.168.1.1/24", "name": "public"}, 
                "eth1": {"ip": "172.16.1.1/24", "name": "protected"}, 
                "eth2": {"ip": "10.1.1.1/24", "name": "management", "management": true}, 
                "eth3": {}
            },
            management = "1"
        }
    )
}

And the template that goes with that?

%{ for port, data in ports ~}
Interface: ${port}%{ if lookup(data, "name", "") != ""}
Alias: ${ lookup(data, "name", "") }%{ endif }
Services: ping%{ if lookup(data, "management", false) == true } ssh https%{ endif }
IP: ${ lookup(data, "ip", "Not Defined") }

%{ endfor }

This results in the following output:

C:\tf>terraform.exe apply -auto-approve

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

test = Interface: eth0
Alias: public
Services: ping
IP: 192.168.1.1/24

Interface: eth1
Alias: protected
Services: ping
IP: 172.16.1.1/24

Interface: eth2
Alias: management
Services: ping ssh https
IP: 10.1.1.1/24

Interface: eth3
Services: ping
IP: Not Defined

Naturally, using this in your own user-data or Custom Data field will probably make more sense than just writing it to “output” 😁

Featured image is β€œStockholms Stadsbibliotek” by β€œdilettantiquity” on Flickr and is released under a CC-BY-SA license.

"Sydney Observatory I" by "Newtown grafitti" on Flickr

Using Feature Flags in Terraform with Count Statements

In a project I’m working on in Terraform, I’ve got several feature flags in a module. These flags relate to whether this module should turn on a system in a cloud provider, or not, and looks like this:

variable "turn_on_feature_x" {
  description = "Setting this to 'yes' will enable Feature X. Any other value will disable it. (Default 'yes')"
  value = "yes"
}

variable "turn_on_feature_y" {
  description = "Setting this to 'yes' will enable Feature Y. Any other value will disable it. (Default 'no')"
  value = "no"
}

When I call the module, I then can either leave the feature with the default values, or selectively enable or disable them, like this:

module "region1" {
  source = "./my_module"
}

module "region2" {
  source = "./my_module"
  turn_on_feature_x = "no"
  turn_on_feature_y = "yes"
}

module "region3" {
  source = "./my_module"
  turn_on_feature_y = "yes"
}

module "region4" {
  source = "./my_module"
  turn_on_feature_x = "no"
}

# Result:
# region1 has X=yes, Y=no
# region2 has X=no, Y=yes
# region3 has X=yes, Y=yes
# region4 has X=no, Y=no

When I then want to use the feature, I have to remember a couple of key parts.

  1. Normally this feature check is done with a “count” statement, and the easiest way to use this is to use the ternary operator to check values and return a “1” or a “0” for if you want the value used.

    Ternary operators look like this: var.turn_on_feature_x == "yes" ? 1 : 0 which basically means, if the value of the variable turn_on_feature_x is set to “yes”, then return 1 otherwise return 0.

    This can get a bit complex, particularly if you want to check several flags a few times, like this: var.turn_on_feature_x == "yes" ? var.turn_on_feature_y == "yes" ? 1 : 0 : 0. I’ve found that wrapping them in brackets helps to understand what you’re getting, like this:

    (
      var.turn_on_feature_x == "yes" ?
      (
        var.turn_on_feature_y == "yes" ?
        1 :
        0
      ) :
      0
    )
  2. If you end up using a count statement, the resulting value must be treated as an 0-indexed array, like this: some_provider_service.my_name[0].result

    This is because, using the count value says “I want X number of resources”, so Terraform has to treat it as an array, in case you actually wanted 10 instead of 1 or 0.

Here’s an example of that in use:

resource "aws_guardduty_detector" "Region" {
  count = var.enable_guardduty == "yes" ? 1 : 0
  enable = true
}

resource "aws_cloudwatch_event_rule" "guardduty_finding" {
  count = (var.enable_guardduty == "yes" ? (var.send_guardduty_findings_to_sns == "yes" ? 1 : (var.send_guardduty_findings_to_sqs == "yes" ? 1 : 0)) : 0)
  name = "${data.aws_caller_identity.current.account_id}-${data.aws_region.current.name}-${var.sns_guardduty_finding_suffix}"
  event_pattern = <<PATTERN
{
  "source": [
    "aws.guardduty"
  ],
  "detail-type": [
    "GuardDuty Finding"
  ]
}
PATTERN
}

resource "aws_cloudwatch_event_target" "sns_guardduty_finding" {
  count = (var.enable_guardduty == "yes" ? (var.send_guardduty_findings_to_sns == "yes" ? 1 : 0) : 0)
  rule = aws_cloudwatch_event_rule.guardduty_finding[0].name
  target_id = aws_sns_topic.guardduty_finding[0].name
  arn = aws_sns_topic.guardduty_finding[0].arn
}

resource "aws_cloudwatch_event_target" "sqs_guardduty_finding" {
  count = (var.enable_guardduty == "yes" ? (var.send_guardduty_findings_to_sqs == "yes" ? 1 : 0) : 0)
  rule = aws_cloudwatch_event_rule.guardduty_finding[0].name
  target_id = "SQS"
  arn = aws_sqs_queue.guardduty_finding[0].arn
}

One thing that bit me rather painfully around this was that if you change from an uncounted resource, like this:

resource "some_tool" "this" {
  some_setting = 1
}

To a counted resource, like this:

resource "some_tool" "this" {
  count = var.some_tool == "yes" ? 1 : 0
  some_setting = 1
}

Then, Terraform will promptly destroy some_tool.this to replace it with some_tool.this[0], because they’re not the same referenced thing!

Fun, huh? 😊

Featured image is β€œSydney Observatory I” by β€œNewtown grafitti” on Flickr and is released under a CC-BY license.

"Kelvin Test" by "Eelke" on Flickr

In Ansible, determine the type of a value, and casting those values to other types

TL;DR? It’s possible to work out what type of variable you’re working with in Ansible. The in-built filters don’t always do quite what you’re expecting. Jump to the “In Summary” heading for my suggestions.

One of the things I end up doing quite a bit with Ansible is value manipulation. I know it’s not really normal, but… well, I like rewriting values from one type of a thing to the next type of a thing.

For example, I like taking a value that I don’t know if it’s a list or a string, and passing that to an argument that expects a list.

Doing it wrong, getting it better

Until recently, I’d do that like this:

- debug:
    msg: |-
      {
        {%- if value | type_debug == "string" or value | type_debug == "AnsibleUnicode" -%}
           "string": "{{ value }}"
        {%- elif value | type_debug == "dict" or value | type_debug == "ansible_mapping" -%}
          "dict": {{ value }}
        {%- elif value | type_debug == "list" -%}
          "list": {{ value }}
        {%- else -%}
          "other": "{{ value }}"
        {%- endif -%}
      }

But, following finding this gist, I now know I can do this:

- debug:
    msg: |-
      {
        {%- if value is string -%}
           "string": "{{ value }}"
        {%- elif value is mapping -%}
          "dict": {{ value }}
        {%- elif value is iterable -%}
          "list": {{ value }}
        {%- else -%}
          "other": "{{ value }}"
        {%- endif -%}
      }

So, how would I use this, given the context of what I was saying before?

- assert:
    that:
    - value is string
    - value is not mapping
    - value is iterable
- some_module:
    some_arg: |-
      {%- if value is string -%}
        ["{{ value }}"]
      {%- else -%}
        {{ value }}
      {%- endif -%}

More details on finding a type

Why in this order? Well, because of how values are stored in Ansible, the following states are true:

⬇️Type \ ➑️Checkis iterableis mappingis sequenceis string
a_dict (e.g. {})βœ”οΈβœ”οΈβœ”οΈβŒ
a_list (e.g. [])βœ”οΈβŒβœ”οΈβŒ
a_string (e.g. “”)βœ”οΈβœ”οΈβœ”οΈβœ”οΈ
A comparison between value types

So, if you were to check for is iterable first, you might match on a_list or a_dict instead of a_string, but string can only match on a_string. Once you know it can’t be a string, you can check whether something is mapping – again, because a mapping can match either a_string or a_dict, but it can’t match a_list. Once you know it’s not that, you can check for either is iterable or is sequence because both of these match a_string, a_dict and a_list.

Likewise, if you wanted to check whether a_float and an_integer is number and not is string, you can check these:

⬇️Type \ ➑️Checkis floatis integeris iterableis mappingis numberis sequenceis string
a_floatβœ”οΈβŒβŒβŒβœ”οΈβŒβŒ
an_integerβŒβœ”οΈβŒβŒβœ”οΈβŒβŒ
A comparison between types of numbers

So again, a_float and an_integer don’t match is string, is mapping or is iterable, but they both match is number and they each match their respective is float and is integer checks.

How about each of those (a_float and an_integer) wrapped in quotes, making them a string? What happens then?

⬇️Type \ ➑️Checkis floatis integeris iterableis mappingis numberis sequenceis string
a_float_as_stringβŒβŒβœ”οΈβŒβŒβœ”οΈβœ”οΈ
an_integer_as_stringβŒβŒβœ”οΈβŒβŒβœ”οΈβœ”οΈ
A comparison between types of numbers when held as a string

This is somewhat interesting, because they look like a number, but they’re actually “just” a string. So, now you need to do some comparisons to make them look like numbers again to check if they’re numbers.

Changing the type of a string

What happens if you cast the values? Casting means to convert from one type of value (e.g. string) into another (e.g. float) and to do that, Ansible has three filters we can use, float, int and string. You can’t cast to a dict or a list, but you can use dict2items and items2dict (more on those later). So let’s start with casting our group of a_ and an_ items from above. Here’s a list of values I want to use:

---
- hosts: localhost
  gather_facts: no
  vars:
    an_int: 1
    a_float: 1.1
    a_string: "string"
    an_int_as_string: "1"
    a_float_as_string: "1.1"
    a_list:
      - item1
    a_dict:
      key1: value1

With each of these values, I returned the value as Ansible knows it, what happens when you do {{ value | float }} to cast it as a float, as an integer by doing {{ value | int }} and as a string {{ value | string }}. Some of these results are interesting. Note that where you see u'some value' means that Python converted that string to a Unicode string.

⬇️Value \ ➑️Castvaluevalue when cast as floatvalue when cast as integervalue when cast as string
a_dict{“key1”: “value1”}0.00“{u’key1′: u’value1′}”
a_float1.11.11“1.1”
a_float_as_string“1.1”1.11“1.1”
a_list[“item1”]0.00“[u’item1′]”
a_string“string”0.00“string”
an_int111“1”
an_int_as_string“1”11“1”
Casting between value types

So, what does this mean for us? Well, not a great deal, aside from to note that you can “force” a number to be a string, or a string which is “just” a number wrapped in quotes can be forced into being a number again.

Oh, and casting dicts to lists and back again? This one is actually pretty clearly documented in the current set of documentation (as at 2.9 at least!)

Checking for miscast values

How about if I want to know whether a value I think might be a float stored as a string, how can I check that?

{{ vars[var] | float | string == vars[var] | string }}

What is this? If I cast a value that I think might be a float, to a float, and then turn both the cast value and the original into a string, do they match? If I’ve got a string or an integer, then I’ll get a false, but if I have actually got a float, then I’ll get true. Likewise for casting an integer. Let’s see what that table looks like:

⬇️Type \ ➑️Checkvalue when cast as floatvalue when cast as integervalue when cast as string
a_floatβœ”οΈβŒβœ”οΈ
a_float_as_stringβœ”οΈβŒβœ”οΈ
an_integerβŒβœ”οΈβœ”οΈ
an_integer_as_stringβŒβœ”οΈβœ”οΈ
A comparison between types of numbers when cast to a string

So this shows us the values we were after – even if you’ve got a float (or an integer) stored as a string, by doing some careful casting, you can confirm they’re of the type you wanted… and then you can pass them through the right filter to use them in your playbooks!

Booleans

Last thing to check – boolean values – “True” or “False“. There’s a bit of confusion here, as a “boolean” can be: true or false, yes or no, 1 or 0, however, is true and True and TRUE the same? How about false, False and FALSE? Let’s take a look!

⬇️Value \ ➑️Checktype_debug is booleanis numberis iterableis mappingis stringvalue when cast as boolvalue when cast as stringvalue when cast as integer
yesboolβœ”οΈβœ”οΈβŒβŒβŒTrueTrue1
YesAnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseYes0
YESAnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseYES0
“yes”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTrueyes0
“Yes”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTrueYes0
“YES”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTrueYES0
trueboolβœ”οΈβœ”οΈβŒβŒβŒTrueTrue1
Trueboolβœ”οΈβœ”οΈβŒβŒβŒTrueTrue1
TRUEboolβœ”οΈβœ”οΈβŒβŒβŒTrueTrue1
“true”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTruetrue0
“True”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTrueTrue0
“TRUE”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTrueTRUE0
1intβŒβœ”οΈβŒβŒβŒTrue11
“1”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈTrue11
noboolβœ”οΈβœ”οΈβŒβŒβŒFalseFalse0
Noboolβœ”οΈβœ”οΈβŒβŒβŒFalseFalse0
NOboolβœ”οΈβœ”οΈβŒβŒβŒFalseFalse0
“no”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseno0
“No”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseNo0
“NO”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseNO0
falseboolβœ”οΈβœ”οΈβŒβŒβŒFalseFalse0
Falseboolβœ”οΈβœ”οΈβŒβŒβŒFalseFalse0
FALSEboolβœ”οΈβœ”οΈβŒβŒβŒFalseFalse0
“false”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalsefalse0
“False”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseFalse0
“FALSE”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalseFALSE0
0intβŒβœ”οΈβŒβŒβŒFalse00
“0”AnsibleUnicodeβŒβŒβœ”οΈβŒβœ”οΈFalse00
Comparisons between various stylings of boolean representations

So, the stand out thing for me here is that while all the permutations of string values of the boolean representations (those wrapped in quotes, like this: "yes") are treated as strings, and shouldn’t be considered as “boolean” (unless you cast for it explicitly!), and all non-string versions of true, false, and no are considered to be boolean, yes, Yes and YES are treated differently, depending on case. So, what would I do?

In summary

  • Consistently use no or yes, true or false in lower case to indicate a boolean value. Don’t use 1 or 0 unless you have to.
  • If you’re checking that you’re working with a string, a list or a dict, check in the order string (using is string), dict (using is mapping) and then list (using is sequence or is iterable)
  • Checking for numbers that are stored as strings? Cast your string through the type check for that number, like this: {% if value | float | string == value | string %}{{ value | float }}{% elif value | int | string == value | string %}{{ value | int }}{% else %}{{ value }}{% endif %}
  • Try not to use type_debug unless you really can’t find any other way. These values will change between versions, and this caused me a lot of issues with a large codebase I was working on a while ago!

Run these tests yourself!

Want to run these tests yourself? Here’s the code I ran (also available in a Gist on GitHub), using Ansible 2.9.10.

---
- hosts: localhost
  gather_facts: no
  vars:
    an_int: 1
    a_float: 1.1
    a_string: "string"
    an_int_as_string: "1"
    a_float_as_string: "1.1"
    a_list:
      - item1
    a_dict:
      key1: value1
  tasks:
    - debug:
        msg: |
          {
          {% for var in ["an_int", "an_int_as_string","a_float", "a_float_as_string","a_string","a_list","a_dict"] %}
            "{{ var }}": {
              "type_debug": "{{ vars[var] | type_debug }}",
              "value": "{{ vars[var] }}",
              "is float": "{{ vars[var] is float }}",
              "is integer": "{{ vars[var] is integer }}",
              "is iterable": "{{ vars[var] is iterable }}",
              "is mapping": "{{ vars[var] is mapping }}",
              "is number": "{{ vars[var] is number }}",
              "is sequence": "{{ vars[var] is sequence }}",
              "is string": "{{ vars[var] is string }}",
              "value cast as float": "{{ vars[var] | float }}",
              "value cast as integer": "{{ vars[var] | int }}",
              "value cast as string": "{{ vars[var] | string }}",
              "is same when cast to float": "{{ vars[var] | float | string == vars[var] | string }}",
              "is same when cast to integer": "{{ vars[var] | int | string == vars[var] | string }}",
              "is same when cast to string": "{{ vars[var] | string == vars[var] | string }}",
            },
          {% endfor %}
          }
---
- hosts: localhost
  gather_facts: false
  vars:
    # true, True, TRUE, "true", "True", "TRUE"
    a_true: true
    a_true_initial_caps: True
    a_true_caps: TRUE
    a_string_true: "true"
    a_string_true_initial_caps: "True"
    a_string_true_caps: "TRUE"
    # yes, Yes, YES, "yes", "Yes", "YES"
    a_yes: yes
    a_yes_initial_caps: Tes
    a_yes_caps: TES
    a_string_yes: "yes"
    a_string_yes_initial_caps: "Yes"
    a_string_yes_caps: "Yes"
    # 1, "1"
    a_1: 1
    a_string_1: "1"
    # false, False, FALSE, "false", "False", "FALSE"
    a_false: false
    a_false_initial_caps: False
    a_false_caps: FALSE
    a_string_false: "false"
    a_string_false_initial_caps: "False"
    a_string_false_caps: "FALSE"
    # no, No, NO, "no", "No", "NO"
    a_no: no
    a_no_initial_caps: No
    a_no_caps: NO
    a_string_no: "no"
    a_string_no_initial_caps: "No"
    a_string_no_caps: "NO"
    # 0, "0"
    a_0: 0
    a_string_0: "0"
  tasks:
    - debug:
        msg: |
          {
          {% for var in ["a_true","a_true_initial_caps","a_true_caps","a_string_true","a_string_true_initial_caps","a_string_true_caps","a_yes","a_yes_initial_caps","a_yes_caps","a_string_yes","a_string_yes_initial_caps","a_string_yes_caps","a_1","a_string_1","a_false","a_false_initial_caps","a_false_caps","a_string_false","a_string_false_initial_caps","a_string_false_caps","a_no","a_no_initial_caps","a_no_caps","a_string_no","a_string_no_initial_caps","a_string_no_caps","a_0","a_string_0"] %}
            "{{ var }}": {
              "type_debug": "{{ vars[var] | type_debug }}",
              "value": "{{ vars[var] }}",
              "is float": "{{ vars[var] is float }}",
              "is integer": "{{ vars[var] is integer }}",
              "is iterable": "{{ vars[var] is iterable }}",
              "is mapping": "{{ vars[var] is mapping }}",
              "is number": "{{ vars[var] is number }}",
              "is sequence": "{{ vars[var] is sequence }}",
              "is string": "{{ vars[var] is string }}",
              "is bool": "{{ vars[var] is boolean }}",
              "value cast as float": "{{ vars[var] | float }}",
              "value cast as integer": "{{ vars[var] | int }}",
              "value cast as string": "{{ vars[var] | string }}",
              "value cast as bool": "{{ vars[var] | bool }}",
              "is same when cast to float": "{{ vars[var] | float | string == vars[var] | string }}",
              "is same when cast to integer": "{{ vars[var] | int | string == vars[var] | string }}",
              "is same when cast to string": "{{ vars[var] | string == vars[var] | string }}",
              "is same when cast to bool": "{{ vars[var] | bool | string == vars[var] | string }}",
            },
          {% endfor %}
          }

Featured image is β€œKelvin Test” by β€œEelke” on Flickr and is released under a CC-BY license.

"Monitors" by "Jaysin Trevino" on Flickr

Using monit to monitor Docker Containers

As I only run a few machines with services that matter on them (notably, my home server and my web server), I don’t need a full-on monitoring service, so instead rely on a system called monit.

Monit is an open source piece of software, used to monitor (see, it’s easily named πŸ˜„) and, if possible remediate issues with things it sees wrong.

I use this for watching whether particular services are running (and if not, restart them), for whether the ink in my printer is empty, and to monitor the free space and SMART status on my disks.

Today I noticed that a Docker container had stopped, and I’d not noticed. It wasn’t a big thing, but it gnawed at me, so I had a bit of a look around to see what I can find about this.

I found this blog post, titled “Monitoring Docker Containers with Monit”, from 2014, which suggested monitoring the result from docker top… and would you believe it, that’s a valid trick πŸ™‚

So, here’s what I’m doing! Each container has it’s own file called/etc/monit/scripts/check_container_<container-name>.sh which has just this command in it:

#! /bin/bash
docker top "<container-name>"
exit $?

Note that you replace <container-name> in both the filename and the script itself with the name of the container – for example, the container hello-world would be monitored with the file check_container_hello-world.sh, and the line in that file would say docker top "hello-world".

I then have a file in /etc/monit/conf.d/ called check_container_<container-name> which has this content

CHECK PROGRAM <container-name> WITH PATH /etc/monit/scripts/check_container_<container-name>.sh
  START PROGRAM = "/usr/bin/docker start <container-name>"
  STOP PROGRAM = "/usr/bin/docker stop <container-name>"
  IF status != 0 FOR 3 CYCLES THEN RESTART
  IF 2 RESTARTS WITHIN 5 CYCLES THEN UNMONITOR

I then ensure that in /etc/monit/monitrc the line “include /etc/monit/conf.d/*” is included and not commented out, and then restart monit with systemctl restart monit.

Featured image is β€œMonitors” by β€œJaysin Trevino” on Flickr and is released under a CC-BY license.

"Field Notes - Sweet Tooth" by "The Marmot" on Flickr

Multi-OS builds in AWS with Terraform – some notes from the field!

Late edit: 2020-05-22 – Updated with better search criteria from colleague conversations

I’m building a proof of concept for … well, a product that needs testing on several different Linux and Windows variants on AWS and Azure. I’m building this environment with Terraform, and it’s thrown me a few curve balls, so I thought I’d document the issues I’ve had!

The versions of distributions I have tested are the latest releases of each of these images at-or-near the time of writing. The major version listed is the earliest I have tested, so no assumption is made about previous versions, and later versions, after the time of this post should not assume any of this data is also accurate!

(Fujitsu Staff – please contact me on my work email address for details on how to get the internal AMIs of our builds of these images πŸ˜„)

Linux Distributions

On the whole, I tend to be much more confident and knowledgable about Linux distributions. I’ve also done far more installs of each of these!

Almost all of these installs are Free of Charge, with the exception of Red Hat Enterprise Linux, which requires a subscription fee, and this can be “Pay As You Go” or “Bring Your Own License”. These sorts of things are arranged for me, so I don’t know how easy or hard it is to organise these licenses!

These builds all use cloud-init, via either a cloud-init yaml script, or some shell scripting language (usually accepted to be bash). If this script fails to execute, you will find your user-data file in /var/lib/cloud/instance/scripts/part-001. If this is a shell script then you will be able to execute it by running that script as your root user.

Amazon Linux 2 or Amzn2

Amazon Linux2 is the “preferred” distribution for Amazon Web Services (AWS) (surprisingly enough). It is based on Red Hat Enterprise Linux (RHEL), and many of the instructions you’ll want to run to install software will use RHEL based instructions. This platform is not available outside the AWS ecosystem, as far as I can tell, although you might be able to run it on-prem.

Software packages are limited in this distribution, so any “extra” features require the installation of the “EPEL” repository, by executing the command sudo amazon-linux-extras install epel and then using the yum command to install further packages. I needed nginx for part of my build, and this was only in EPEL.

Amzn2 AMI Lookup

data "aws_ami" "amzn2" {
  most_recent = true

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-2.0.*-gp2"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["amazon"] # Canonical
}

Amzn2 User Account

Amazon Linux 2 images under AWS have a default “ec2-user” user account. sudo will allow escalation to Root with no password prompt.

Amzn2 AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed. To manage the interface, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 and apply changes with ifdown eth0 ; ifup eth0.

Amzn2 user-data / Cloud-Init Troubleshooting

I’ve found the output from user-data scripts appearing in /var/log/cloud-init-output.log.

CentOS 7

For starters, AWS doesn’t have an official CentOS8 image, so I’m a bit stymied there! In fact, as far as I can make out, CentOS is only releasing ISOs for builds now, and not any cloud images. There’s an open issue on their bug tracker which seems to suggest that it’s not going to get any priority any time soon! Blimey.

This image may require you to “subscribe” to the image (particularly if you have a “private marketplace”), but this will be requested of you (via a URL provided on screen) when you provision your first machine with this AMI.

Like with Amzn2, CentOS7 does not have nginx installed, and like Amzn2, installation of the EPEL library is not a difficult task. CentOS7 bundles a file to install the EPEL, installed by running yum install epel-release. After this is installed, you have the “full” range of software in EPEL available to you.

CentOS AMI Lookup

data "aws_ami" "centos7" {
  most_recent = true

  filter {
    name   = "name"
    values = ["CentOS Linux 7*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["aws-marketplace"]
}

CentOS User Account

CentOS7 images under AWS have a default “centos” user account. sudo will allow escalation to Root with no password prompt.

CentOS AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed. To manage the interface, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 and apply changes with ifdown eth0 ; ifup eth0.

CentOS Cloud-Init Troubleshooting

I’ve run several different user-data located bash scripts against this system, and the logs from these scripts are appearing in the default syslog file (/var/log/syslog) or by running journalctl -xefu cloud-init. They do not appear in /var/log/cloud-init-output.log.

Red Hat Enterprise Linux (RHEL) 7 and 8

Red Hat has both RHEL7 and RHEL8 images in the AWS market place. The Proof Of Value (POV) I was building was only looking at RHEL7, so I didn’t extensively test RHEL8.

Like Amzn2 and CentOS7, RHEL7 needs EPEL installing to have additional packages installed. Unlike Amzn2 and CentOS7, you need to obtain the EPEL package from the Fedora Project. Do this by executing these two commands:

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install epel-release-latest-7.noarch.rpm

After this is installed, you’ll have access to the broader range of software that you’re likely to require. Again, I needed nginx, and this was not available to me with the stock install.

RHEL7 AMI Lookup

data "aws_ami" "rhel7" {
  most_recent = true

  filter {
    name   = "name"
    values = ["RHEL-7*GA*Hourly*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["309956199498"] # Red Hat
}

RHEL8 AMI Lookup

data "aws_ami" "rhel8" {
  most_recent = true

  filter {
    name   = "name"
    values = ["RHEL-8*HVM-*Hourly*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["309956199498"] # Red Hat
}

RHEL User Accounts

RHEL7 and RHEL8 images under AWS have a default “ec2-user” user account. sudo will allow escalation to Root with no password prompt.

RHEL AWS Interface Configuration

The primary interface is called eth0. Network Manager is installed, and the eth0 interface has a profile called “System eth0” associated to it.

RHEL Cloud-Init Troubleshooting

In RHEL7, as per CentOS7, logs from user-data scripts are appear in the general syslog file (in this case, /var/log/messages) or by running journalctl -xefu cloud-init. They do not appear in /var/log/cloud-init-output.log.

In RHEL8, logs from user-data scrips now appear in /var/log/cloud-init-output.log.

Ubuntu 18.04

At the time of writing this, the vendor, who’s product I was testing, categorically stated that the newest Ubuntu LTS, Ubuntu 20.04 (Focal Fossa) would not be supported until some time after our testing was complete. As such, I spent no time at all researching or planning to use this image.

Ubuntu is the only non-RPM based distribution in this test, instead being based on the Debian project’s DEB packages. As such, it’s range of packages is much wider. That said, for the project I was working on, I required a later version of nginx than was available in the Ubuntu Repositories, so I had to use the nginx Personal Package Archive (PPA). To do this, I found the official PPA for the nginx project, and followed the instructions there. Generally speaking, this would potentially risk any support from the distribution vendor, as it’s not certified or supported by the project… but I needed that version, so I had to do it!

Ubuntu 18.04 AMI Lookup

data "aws_ami" "ubuntu1804" {
  most_recent = true

  filter {
    name   = "name"
    values = ["*ubuntu*18.04*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["099720109477"] # Canonical
}

Ubuntu 18.04 User Accounts

Ubuntu 18.04 images under AWS have a default “ubuntu” user account. sudo will allow escalation to Root with no password prompt.

Ubuntu 18.04 AWS Interface Configuration

The primary interface is called eth0. Network Manager is not installed, and instead Ubuntu uses Netplan to manage interfaces. The file to manage the interface defaults is /etc/netplan/50-cloud-init.yaml. If you struggle with this method, you may wish to install ifupdown and define your configuration in /etc/network/interfaces.

Ubuntu 18.04 Cloud-Init Troubleshooting

In Ubuntu 18.04, logs from user-data scrips appear in /var/log/cloud-init-output.log.

Windows

This section is far more likely to have it’s data consolidated here!

Windows has a common “standard” username – Administrator, and a common way of creating a password (this is generated on-boot, and the password is transferred to the AWS Metadata Service, which it is retrieved and decrypted with the SSH key you’ve used to build the “authentication” to the box) which Terraform handles quite nicely.

The network device is referred to as “AWS PV Network Device #0”. It can be managed with powershell, netsh (although apparently Microsoft are rumbling about demising this script), or from the GUI.

Windows 2012R2

This version is very old now, and should be compared to Windows 7 in terms of age. It is only supported by Microsoft with an extended maintenance package!

Windows 2012R2 AMI Lookup

data "aws_ami" "w2012r2" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2012-R2_RTM-English-64Bit-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2012R2 Cloud-Init Troubleshooting

Logs from the Metadata Service can be found in C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt. You can also find the userdata script in C:\Program Files\Amazon\Ec2ConfigService\Scripts\UserScript.ps1. This can be launched and debugged using PowerShell ISE, which is in the “Start” menu.

Windows 2016

This version is reasonably old now, and should be compared to Windows 8 in terms of age. It is supported until 2022 in “mainline” support.

Windows 2016 AMI Lookup

data "aws_ami" "w2016" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2016-English-Full-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2016 Cloud-Init Troubleshooting

The metadata service has moved from Windows 2016 and onwards. Logs are stored in a partially hidden directory tree, so you may need to click in the “Address” bar of the Explorer window and type in part of this path. The path to these files is: C:\ProgramData\Amazon\EC2-Windows\Launch\Log. I say “files” as there are two parts to this file – an “Ec2Launch.log” file which reports on the boot process, and “UserdataExecution.log” which shows the output from the userdata script.

Unlike with the Windows 2012R2 version, you can’t get hold of the actual userdata script on the filesystem, you need to browse to a special path in the metadata service (actually, technically, you can do this with any of the metadata services – OpenStack, Azure, and so on) which is: http://169.254.169.254/latest/user-data/

This will contain userdata between a <powershell> and </powershell> pair of tags. This would need to be copied out of this URL and pasted into a new file on your local machine to determine why issues are occurring. Again, I would recommend using PowerShell ISE from the Start Menu to debug your code.

Windows 2019

This version is the most recent released version of Windows Server, and should be compared to Windows 10 in terms of age.

Windows 2019 AMI Lookup

data "aws_ami" "w2019" {
  most_recent = true

  filter {
    name = "name"
    values = ["Windows_Server-2019-English-Full-Base*"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }

  owners = ["801119661308"] # AWS
}

Windows 2019 Cloud-Init Troubleshooting

Functionally, the same as Windows 2016, but to recap, the metadata service has moved from Windows 2016 and onwards. Logs are stored in a partially hidden directory tree, so you may need to click in the “Address” bar of the Explorer window and type in part of this path. The path to these files is: C:\ProgramData\Amazon\EC2-Windows\Launch\Log. I say “files” as there are two parts to this file – an “Ec2Launch.log” file which reports on the boot process, and “UserdataExecution.log” which shows the output from the userdata script.

Unlike with the Windows 2012R2 version, you can’t get hold of the actual userdata script on the filesystem, you need to browse to a special path in the metadata service (actually, technically, you can do this with any of the metadata services – OpenStack, Azure, and so on) which is: http://169.254.169.254/latest/user-data/

This will contain userdata between a <powershell> and </powershell> pair of tags. This would need to be copied out of this URL and pasted into a new file on your local machine to determine why issues are occurring. Again, I would recommend using PowerShell ISE from the Start Menu to debug your code.

Featured image is β€œField Notes – Sweet Tooth” by β€œThe Marmot” on Flickr and is released under a CC-BY license.

β€œNew shoes” by β€œMorgaine” from Flickr

Making Windows Cloud-Init Scripts run after a reboot (Using Terraform)

I’m currently building a Proof Of Value (POV) environment for a product, and one of the things I needed in my environment was an Active Directory domain.

To do this in AWS, I had to do the following steps:

  1. Build my Domain Controller
    1. Install Windows
    2. Set the hostname (Reboot)
    3. Promote the machine to being a Domain Controller (Reboot)
    4. Create a domain user
  2. Build my Member Server
    1. Install Windows
    2. Set the hostname (Reboot)
    3. Set the DNS client to point to the Domain Controller
    4. Join the server to the domain (Reboot)

To make this work, I had to find a way to trigger build steps after each reboot. I was working with Windows 2012R2, Windows 2016 and Windows 2019, so the solution had to be cross-version. Fortunately I found this script online! That version was great for Windows 2012R2, but didn’t cover Windows 2016 or later… So let’s break down what I’ve done!

In your userdata field, you need to have two sets of XML strings, as follows:

<persist>true</persist>
<powershell>
$some = "powershell code"
</powershell>

The first block says to Windows 2016+ “keep trying to run this script on each boot” (note that you need to stop it from doing non-relevant stuff on each boot – we’ll get to that in a second!), and the second bit is the PowerShell commands you want it to run. The rest of this now will focus just on the PowerShell block.

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {
      0 {
        $runCount = 1 + [int]$runCount
        Set-ItemProperty -Path $Path -Name RunCount -Value $runCount
        if ($ver -match 2012) {
          #Enable user data
          $EC2SettingsFile = "$env:ProgramFiles\Amazon\Ec2ConfigService\Settings\Config.xml"
          $xml = [xml](Get-Content $EC2SettingsFile)
          $xmlElement = $xml.get_DocumentElement()
          $xmlElementToModify = $xmlElement.Plugins
          
          foreach ($element in $xmlElementToModify.Plugin)
          {
            if ($element.name -eq "Ec2HandleUserData") {
              $element.State="Enabled"
            }
          }
          $xml.Save($EC2SettingsFile)
        }
        $some = "PowerShell Script"
      }
    }
  }

Whew, what a block! Well, again, we can split this up into a couple of bits.

In the first few lines, we build a pointer, a note which says “We got up to here on our previous boots”. We then read that into a variable and find that number and execute any steps in the block with that number. That’s this block:

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {

    }
  }

The next part (and you’ll repeat it for each “number” of reboot steps you need to perform) says “increment the number” then “If this is Windows 2012, remind the userdata handler that the script needs to be run again next boot”. That’s this block:

      0 {
        $runCount = 1 + [int]$runCount
        Set-ItemProperty -Path $Path -Name RunCount -Value $runCount
        if ($ver -match 2012) {
          #Enable user data
          $EC2SettingsFile = "$env:ProgramFiles\Amazon\Ec2ConfigService\Settings\Config.xml"
          $xml = [xml](Get-Content $EC2SettingsFile)
          $xmlElement = $xml.get_DocumentElement()
          $xmlElementToModify = $xmlElement.Plugins
          
          foreach ($element in $xmlElementToModify.Plugin)
          {
            if ($element.name -eq "Ec2HandleUserData") {
              $element.State="Enabled"
            }
          }
          $xml.Save($EC2SettingsFile)
        }
        
      }

In fact, it’s fair to say that in my userdata script, this looks like this:

  $path= 'HKLM:\Software\UserData'
  
  if(!(Get-Item $Path -ErrorAction SilentlyContinue)) {
    New-Item $Path
    New-ItemProperty -Path $Path -Name RunCount -Value 0 -PropertyType dword
  }
  
  $runCount = Get-ItemProperty -Path $path -Name Runcount -ErrorAction SilentlyContinue | Select-Object -ExpandProperty RunCount
  
  if($runCount -ge 0) {
    switch($runCount) {
      0 {
        ${file("templates/step.tmpl")}

        ${templatefile(
          "templates/rename_windows.tmpl",
          {
            hostname = "SomeMachine"
          }
        )}
      }
      1 {
        ${file("templates/step.tmpl")}

        ${templatefile(
          "templates/join_ad.tmpl",
          {
            dns_ipv4 = "192.0.2.1",
            domain_suffix = "ad.mycorp",
            join_account = "ad\someuser",
            join_password = "SomePassw0rd!"
          }
        )}
      }
    }
  }

Then, after each reboot, you need a new block. I have a block to change the computer name, a block to join the machine to the domain, and a block to install an software that I need.

Featured image is β€œNew shoes” by β€œMorgaine” on Flickr and is released under a CC-BY-SA license.

"Fishing line and bobbin stuck on tree at Douthat State Park" by "Virginia State Parks" on Flickr

Note to self: Linux shell scripts don’t cope well with combined CRLF + LF files… Especially in User-Data / Custom Data / Cloud-Init scripts

This one is more a nudge to myself. On several occasions when building Infrastructure As Code (IAC), I split out a code sections into one or more files, for readability and reusability purposes. What I tended to do, and this was more apparent with the Linux builds than the Windows builds, was to forget to set the line terminator from CRLF to LF.

While this doesn’t really impact Windows builds too much (they’re kinda designed to support people being idiots with line endings now), Linux still really struggles with CRLF endings, and you’ll only see when you’ve broken this because you’ll completely fail to run any of the user-data script.

How do you determine this is your problem? Well, actually it’s a bit tricky, as neither cat, less, more or nano spot this issue. The only two things I found that identified it were file and vi.

The first part of the combined file with mixed line endings. This part has LF termination.
The second part of the combined file with mixed line endings. This part has CRLF termination.
What happens when we cat these two parts into one file? A file with CRLF, LF line terminators obviously!
What the combined file looks like in Vi. Note the blue ^M at the ends of the lines.

So, how to fix this? Assuming you’re using Visual Studio Code;

A failed line-ending clue in Visual Studio Code

You’ll notice this line showing “CRLF” in the status bar at the bottom of Code. Click on that, which brings up a discrete box near the top, as follows:

Oh no, it’s set to “CRLF”. That’s not what we want!

Selecting LF in that box changes the line feeds into LF for this file, but it’s not saved. Make sure you save this file before you re-run your terraform script!

Notice, we’re now using LF endings, but the file isn’t saved.

Fantastic! It’s all worked!

In Nano, I’ve opened the part with the invalid line endings.

Oh no! We have a “DOS Format” file. Quick, let’s fix it!

To fix this, we need to write the file out. Hit Ctrl+O. This tells us that we’re in DOS Format, and also gives us the keyboard combination to toggle “DOS Format” off – it’s Alt+D (In Unix/Linux world, the Alt key is referred to as the Meta key – hence M not A).

This is how we fix things

So, after hitting Alt+D, the “File Name to write” line changes, see below:

Yey, no pesky “DOS Format” warning here!

Using either editor (or any others, if you know how to solve line ending issues in other editors), you still need to combine your script back together before you can run it, so… do that, and your file will be fine to run! Good luck!

Featured image is β€œFishing line and bobbin stuck on tree at Douthat State Park” by β€œVirginia State Parks” on Flickr and is released under a CC-BY license.

"Root" by "llee_wu" on Flickr

A quick note on using #Firefox in #Windows in a #Corporate or #Enterprise environment

I’ve been using Firefox as my “browser of choice” for around 15 years. I tend to prefer to use it for all sorts of reasons, but the main thing I expect is support for extensions. Not many of them, but … well, there’s a few!

There are two stumbling blocks for using Firefox in a corporate or enterprise setting. These are:

  1. NTLM or Kerberos Authentication for resources like Sharepoint and ADFS.
  2. Enterprise TLS certificates (usually deployed via GPO as part of the domain)

These are both trivially fixed in the about:config screen, but first you need to get past a scary looking warning page!

In the address bar, where it probably currently says jon.sprig.gs, click in there and type about:config.

Getting to about:config

This brings you to a scary page!

Proceed with caution! (of course!!)

Click the “Accept the Risk and Continue” (note, this is with Firefox 76. Wording with later or earlier versions may differ).

As if it wasn’t obvious enough from the previous screen, this “may impact performance and security”…

And then you get a search box.

In the “Search preference name” type in ntlm and find the line that says network.automatic-ntlm-auth.trusted-uris.

The “NTLM Options page”

Type in there the suffixes of any TRUSTED domains. For example, if your company uses the domain names of bigcompany.com, bigco.local and big.company then you’d type in:

bigcompany.com,bigco.local,big.company

Any pages that you browse to, where they request NTLM authentication, will receive an NTLM set of credentials if prompted (same as IE, Edge, and Chrome already do!) NTLM is effectively a way to pass a trusted Kerberos ticket (a bit like your domain credentials) into a web page.

Next up, let’s get those pesky certificate errors removed!

This assumes that you have a centrally managed TLS Root Certificate, and the admins in your network haven’t just been dumping self signed certificates everywhere (nothing gets around that… just sayin’).

Still in about:config, clear the search box and type enterprise, like this!

Enterprise Roots are here!

Find the line security.enterprise_roots.enabled and make sure it says true. If it doesn’t double click it, so it does.

Now you can close your preferences page, and you should be fine to visit your internal source code repository, time sheeting system or sharepoint site, with almost no interruptions!

If you’ve been tasked for turning this stuff on in your estate of managed desktop environment machines, then you might find this article (on Autoconfiguration of Firefox) of use (but I’ve not tried it!)

Featured image is “Root” by “llee_wu” on Flickr and is released under a CC-BY-ND license.

"Debian" by "medithIT" on Flickr

One to read: Installing #Debian on #QNAP TS-219P

A couple of years ago, a very lovely co-worked gave me his QNAP TS-219P which he no longer required. I’ve had it sitting around storing data, but not making the most use of it since he gave it to me.

After a bit of tinkering in my home network, I decided that I needed a more up-to-date OS on this device, so I found this page that tells you how to install Debian Buster. This will wipe the device, so make sure you’ve got a full backup of your content!

https://www.cyrius.com/debian/kirkwood/qnap/ts-219/install/

Essentially, you backup the existing firmware with these commands:

cat /dev/mtdblock0 > mtd0
cat /dev/mtdblock1 > mtd1
cat /dev/mtdblock2 > mtd2
cat /dev/mtdblock3 > mtd3
cat /dev/mtdblock4 > mtd4
cat /dev/mtdblock5 > mtd5

These files need to be transferred off, in “case of emergency”, then download the installation files:

mkdir /tmp/debian
cd /tmp/debian
busybox wget http://ftp.debian.org/debian/dists/buster/main/installer-armel/current/images/kirkwood/network-console/qnap/ts-21x/{initrd,kernel-6281,kernel-6282,flash-debian,model}
sh flash-debian
reboot

The flash-debian command takes around 3-5 minutes, apparently, although I did start the job and walk away, so it might have taken anywhere up to 30 minutes!

Then, SSH to the IP of the device, and use the credentials installer and install as username and password.

Complete the installation steps in the SSH session, then let it reboot.

Be aware that the device is likely to have at least one swap volume it will try to load, so it might be worth opening a shell and running the command “swapoff -a” before removing the swap partitions. It’s also worth removing all the partitions and then rebooting and starting again if you have any problems with the partitioning.

When it comes back, you have a base installation of Debian, which doesn’t have sudo installed, so use su - and put in the root password.

Good luck!