Work Stuff – Page 5 – A nice guy's view on life

Trying out Kubernetes (K8S) with MicroK8S in Vagrant

2020-01-15 JonTheNiceGuy

I’m going on a bit of a containers kick at the moment, and just recently I wanted to give Kubernetes (sometimes abbreviated to “K8S”) a try.

Kubernetes is an orchestration engine for Containers, like Docker. It’s designed to take the images that Docker (and other similar tools) produce, and run them across multiple nodes. You need to have a handle on how Docker works before giving K8S a try, but once you do, it’s well worth a shot to understand K8S.

Unlike Docker, K8S is a bit more in-depth in its requirements, and people are often pointed at Minikube as their introduction to K8S. However, my colleague and friend Nick suggested I might be better off with MicroK8S.

MicroK8S is an application released by Canonical as a Snap. A Snap is a Linux packaging format, similar to FlatPak and AppImage. It’s mostly used on Ubuntu based operating systems, but can also work on other Linux distributions.

I had an initial, failed, punt with the recommended advice for using MicroK8S on Windows (short story: Hyper-V did not work for me, and the VirtualBox back-end doesn’t expose any network ports, or at least, if it does, I couldn’t see how to make it work). As I’m reasonably confident in making Vagrant work in Windows, I built a Vagrantfile to deliver MicroK8S.

To use this, you need Vagrant and VirtualBox, and then get the Vagrantfile from repo… then run vagrant up (it will ask you what interface you want to “bridge” to – this will be how you access the Kubernetes pods and Docker containers). Once the machine has finished building, you can run vagrant ssh to connect into it. From here, you can run your kubectl commands, as well as docker commands.
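As a rough sketch, the steps above boil down to a couple of commands. The `run` helper and the `DRY_RUN` variable are my own illustration (they are not part of the Vagrantfile), and I'm assuming `kubectl` is available inside the VM, as described above. By default this only prints what it would do.

```shell
#!/bin/bash
# Hypothetical helper: print each command, and only execute it when DRY_RUN is not 1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

# Build the MicroK8S VM, then ask Kubernetes for its node list from inside it.
microk8s_up() {
  run vagrant up || return 1
  run vagrant ssh -c "kubectl get nodes"
}

microk8s_up
```

Run it with `DRY_RUN=0` to execute the commands for real; `vagrant up` will still stop to ask which interface to bridge to.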

If you want to experiment with a multi-node environment, then I also built a Vagrantfile to deliver two virtual machines, both running MicroK8S, and used the shared storage element of Vagrant to transfer the “join” instruction from the first node to the second.

Of course, now I just need to work out how the hell I do Kubernetes 🤣

Featured image is “Captain” by “The Laddie” on Flickr and is released under a CC-BY-ND license.

nobodys perfect nbc GIF by The Good Place from Giphy

Talk Summary – FDE Conference “Automation in an Infrastructure as Code World”

2019-10-09 (updated 2019-10-14) JonTheNiceGuy

Format: Theatre Style room. ~70 attendees.

Slides: Available to view (Firefox/Chrome recommended – press “S” to see the required speaker notes), Code referenced in the slides also available to view.

Video: Not on the day, but I recorded a take of it at home after the event. The delivery on the day was better, but the content is there at least! :)

Slot: Slot 2 Wednesday 14:15-15:00

Notes: FDE is the abbreviation of “Fujitsu Distinguished Engineer”, an internal program at Fujitsu. Each year they hold a conference for all the FDEs to attend. This is my second year as an FDE, and the first where I’m presenting.

This slide deck was massively re-worked, following some excellent feedback at BCMcr9. I then, unusually for me, gave the deck two separate run through sessions with colleagues, and tweaked it following each run.

This deck includes Creative Commons licensed images (which is fairly common for my slide decks), but also, in a new and unusual step for me, includes meme gifs from Giphy. I’m not really sure whether this is a step forward or back for me, as I do prefer permissive licenses. That said, the memes seem to be more engaging – particularly as they’re animated. I’d never had someone comment on the images in my slide deck until I did the first run through with the memes in with a colleague, and then again when I ran it a second time they particularly brought up the animated images… so the memes are staying for now.

I’m also slightly disappointed with myself that I couldn’t stick to the “One Bold Word” style of presentations (the format preferred by Jono Bacon), and found myself littering more and more content into the screen. I was, however, proud of myself for including the “Tweetable content” slide, as recommended, I think, by Lorna Mitchell (@LornaJane). I also included a “Your next steps” slide, as recommended by Andy Bounds (although I suspect he’d be disappointed with the “Questions?” slide at the end!)

This deck required quite a bit of research on my part. I’d never written CloudFormation (CF) before, and I’d only really copied-and-pasted Terraform (I refer to it as TF which probably isn’t right) before. I wrote a full stack of machines in CF and Azure Resource Manager (ARM) for the native technologies, as well as the same stacks in both TF and Ansible for both Azure and AWS. I also looked into how to deploy the CF and ARM templates with both Terraform and Ansible, and finally how to use TF from Ansible. I already knew how to run Ansible from within userdata/customdata arguments in AWS and Azure, but I included it and tested it as part of the deck too.

I had some amazing feedback from the audience and some great questions asked of me. I loved the response from the audience to some of my GIFs (although one comment that was made was that I need to stop the animations after the first run!)

Following the session, as I’d hoped, a few of the fellow attendees came forward to ask if we could talk further about the subject. If you are someone who uses these tools, I would encourage you to give me a shout – I want to do more and find out about your projects, processes and tools!

My intention is to start using this slide deck at meet-ups in the Greater Manchester area, hopefully without having to re-write it that much!

AWS Game Day

2019-09-18 JonTheNiceGuy

I was invited, through work, to participate in an AWS tradition – the AWS Game Day. This event was organised by my employer for our internal staff to experience a day in the life of a fully deployed AWS environment… and have some fun with it too. The AWS Game Day is a common scenario, and if you’re lucky enough to join one, you’ll probably be doing this one… As such, there will be… #NoSpoilers.

A Game Day (sometimes disambiguated as an “Adversarial Game Day”, because of sporting events) is a day where you either have a dummy environment, or, if you have the scale, a portion of your live network is removed from live service and used as a training ground. In this case, AWS provided a specific dummy environment “Unicorn.Rentals”, and all the attendees are the new recruits to the DevOps Team… Oh, and all the previous DevOps team members had just left the company… all at once.

Attendees were split into teams of four, and each team had a disparate background.

We were given access to:

- Our login panel. This gave us our score, our trending increase or decrease in score over the last “period” (I think it was 5 minutes), our access to the AWS console, and a panel to update the CNAME for the DNS records.
- AWS Console. This is a mostly unrestricted account in AWS. There are some things we didn’t get access to – for example, we didn’t get the CloudFormation Template for setting up the game day, and we couldn’t make changes to the IAM environment at all. Oh, and what was particularly frustrating was not being able to … Oh yes, I forgot, #NoSpoilers ;)
- A central scoreboard of all the teams.
- A running tally of how we were scored:
  - Each web request served under X seconds received one score.
  - Each request served between X and Y seconds received another score.
  - Each request served over Y seconds received a third score.
  - Failing to respond to a request received a negative score.
  - Infrastructure costs deducted points from the score (to stop you just throwing ALL THE SERVERS at it, ALL THE TIME).
- The outgoing DevOps team’s “runbook”. Not too dissimilar to the sort of documentation you write before you go on leave. “If this thing breaks, run this or just reboot the box”, “You might see this fail with something like this message if the server can’t keep up with the load”. Enough to give you a pointer on where to look, not quite enough to give you the answer :)

The environment we were working on was, well, relatively simple. An auto-scaling web service, running a simple binary on an EC2 instance behind a load balancer. We extended the reach of services we could use (#NoSpoilers!) to give us greater up-time, improved responsiveness and broader scope of access. We were also able to monitor … um, things :) and change the way we viewed the application.

I don’t want to give too many details, because it will spoil the surprises, but I will say that we learned a lot about the services in AWS we had access to, which wasn’t the full product set (just “basic” AWS IaaS tooling).

When the event finished, everyone I spoke to agreed that having a game day is a really good idea! One person said “You only really learn something when you fix it! This is like being called out, without the actual impact to a customer” and another said “I’ve done more with AWS in this day than I have the past couple of months since I’ve been looking at it.”

And, as you can probably tell, I agree! I’d love to see more game days like this! I can see how running something like this, on technology you use in your customer estate, can be unbelievably powerful – especially if you’ve got a mildly nefarious GM running some background processes to break things (#NoSpoilers). If you can make it time-sensitive too (“you’ve got one day to restore service”, or like in this case, “every minute we’re not selling product, we’re losing points”), then that makes it feel like you’ve been called out, but without the stress of feeling like you’re actually going to lose your job at the end of the day (not that I’ve ever actually felt like that when I’ve been called out!!)

Anyway, massive kudos to our AWS SE team for delivering the training, and a huge cheer of support to Sara for getting the event organised. I look forward to getting invited to a new scenario sometime soon! ;)

Here are some pictures from the event!

The teams get to know each other, and we find out about the day ahead! Picture by @Fujitsu_FDE.

Our team, becoming a team by changing the table layout! It made a difference, we went to the top of the leader board for at least 5 minutes! Picture by @Fujitsu_FDE.

The final scores. Picture by @Fujitsu_FDE

Our lucky attendees got to win some of these items! Picture by @Fujitsu_FDE

“Well Done” (ha, yehr, right!) to the winning team (“FIX!”) “UnicornsRUs”. Picture by @Fujitsu_FDE.

The featured image is “AWS Game Day Attendees” by @Fujitsu_FDE.

Time Based Security

2019-08-21 JonTheNiceGuy

I came across the concept of “Time Based Security” (TBS) in the Sysadministrivia podcast, S4E13.

I’m still digging into the details of it, but in essence, the “Armadillo” protection model (crunchy on the outside, soft on the inside; sometimes known as the “Fortress Model”) is broken. You assume that your impenetrable network boundary will prevent attackers from getting to your sensitive data. While this may stop them for a while, that boundary is just one part of a complex protection system, and many organisations miss that fact.

The examples used in the only online content I’ve found about this refer to a burglary.

In this context, your “Protection” (P) is measured in time. Perhaps you have hardened glass that takes 20 seconds to break.

Next, we evaluate “Detection” (D) which is also, surprisingly enough, measured in time. As the glass is hit, it triggers an alarm to a security facility. That takes 20 seconds to respond and goes to a dispatch centre, another 20 seconds for that to be answered and a police officer dispatched.

The police officer being dispatched is the “Response” (R). The police take (optimistically) 2 minutes to arrive (it was written in the 90’s so the police forces weren’t decimated then).

So, in the TBS system, we say that Detection (D) of 40 seconds plus Response (R) of 120 seconds = 160 seconds. This is greater than Protection (P) of 20 seconds, so we have an Exposure (E) time of 140 seconds: E = (D + R) – P. The question that is posed is, how much damage can be done in E?

So, compare this to your average pre-automation SOC. Your firewall, SIEM (Security Information and Event Management system), IDS (Intrusion Detection System) or WAF (Web Application Firewall) triggers an alarm. Someone is trying to do something (e.g. a Denial Of Service attack, password spraying or port scanning for vulnerable services) to a system you’re responsible for. While D might be a tiny fraction of a minute (let’s say 1 minute, for maths’ sake), R is likely to be minutes or even hours, depending on the refresh rate of the ticket management system or alarm system (again, for maths’ sake, let’s say 60 minutes). So, D+R is now 61 minutes. How long is P really going to hold? Could it be less than 30 minutes against a determined attacker? (Let’s assume P is 30 minutes for maths’ sake).

Let’s do the calculation for a pre-automation SOC (Security Operations Centre). (D+R)-P=E. E here is 31 minutes. How much damage can an attacker do in 31 minutes? Could they put a backdoor into your system? Can they download sensitive data to a remote system? Could they pivot to your monitoring system, and remove the logs that said they were in there?

If you consider how much smaller the D and R numbers become with an event driven SOAR (Security Orchestration, Automation and Response) system – does that improve your P and E numbers? Consider that if you can get E to 0, this could be considered to be “A Secure Environment”.
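As a quick sanity check on the sums above, here is a minimal sketch that treats exposure as detection plus response, minus protection, which reproduces the 140 second and 31 minute figures. The function name and the SOAR timings in the last example are made up for illustration.

```shell
#!/bin/bash
# Exposure in a Time Based Security model: E = (D + R) - P.
# Arguments are whole numbers in the same unit (seconds or minutes).
# A result of 0 or less means protection outlasts detection plus response.
tbs_exposure() {
  local protection=$1 detection=$2 response=$3
  echo $(( detection + response - protection ))
}

# Burglary example: P = 20s, D = 40s, R = 120s
echo "Burglary exposure: $(tbs_exposure 20 40 120) seconds"

# Pre-automation SOC example: P = 30min, D = 1min, R = 60min
echo "Pre-automation SOC exposure: $(tbs_exposure 30 1 60) minutes"

# Assumed SOAR timings (hypothetical): P = 30min, D = 1min, R = 2min
echo "Automated SOC exposure: $(tbs_exposure 30 1 2) minutes"
```

With the assumed SOAR timings the result goes negative, meaning protection would hold for longer than it takes to detect and respond.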

Also, consider the fact that many of the tools we implement for security reduce D and R, but if you’re not monitoring the outputs of the Detection components, then your response time grows significantly. If your Detection component is misconfigured in that it’s producing too many False Positives (for example, “The Boy Who Cried Wolf“), so you don’t see the real incident, then your Response might only be when a security service notifies you that your data, your service or your money has been exposed and lost. And that wouldn’t be good now… Time to look into automation 😁

Featured image is “Swatch Water Store, Grand Central Station, NYC, 9/2016, pics by Mike Mozart of TheToyChannel and JeepersMedia on YouTube #Swatch #Watch” by “Mike Mozart” on Flickr and is released under a CC-BY license.

Building a simple CA for testing purposes

2019-08-15 (updated 2019-08-21) JonTheNiceGuy

I recently needed to create a Certificate Authority with an Intermediate Certificate to test some TLS inspection stuff at work. This script (based on a document I found at jamielinux.com) builds a Certificate Authority and creates an Intermediate Certificate Authority using the root.

#! /bin/bash
# Heavily based on https://jamielinux.com/docs/openssl-certificate-authority/
# Start from
# sudo -i

mkdir /root/ca
cd /root/ca
mkdir certs crl newcerts private
chmod 700 private
touch index.txt
echo 1000 > serial
wget https://jamielinux.com/docs/openssl-certificate-authority/_downloads/root-config.txt -O openssl.cnf
openssl genrsa -aes256 -out private/ca.key.pem 4096
chmod 400 private/ca.key.pem
openssl req -config openssl.cnf -key private/ca.key.pem -new -x509 -days 7300 -sha256 -extensions v3_ca -out certs/ca.cert.pem
chmod 444 certs/ca.cert.pem
openssl x509 -noout -text -in certs/ca.cert.pem

mkdir /root/ca/intermediate
cd /root/ca/intermediate
mkdir certs crl csr newcerts private
chmod 700 private
touch index.txt
echo 1000 > serial
echo 1000 > crlnumber
wget https://jamielinux.com/docs/openssl-certificate-authority/_downloads/intermediate-config.txt -O openssl.cnf

cd /root/ca
openssl genrsa -aes256 -out intermediate/private/intermediate.key.pem 4096
chmod 400 intermediate/private/intermediate.key.pem
openssl req -config intermediate/openssl.cnf -new -sha256 -key intermediate/private/intermediate.key.pem -out intermediate/csr/intermediate.csr.pem
openssl ca -config openssl.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in intermediate/csr/intermediate.csr.pem -out intermediate/certs/intermediate.cert.pem
chmod 444 intermediate/certs/intermediate.cert.pem
openssl x509 -noout -text -in intermediate/certs/intermediate.cert.pem
openssl verify -CAfile certs/ca.cert.pem intermediate/certs/intermediate.cert.pem
cat intermediate/certs/intermediate.cert.pem certs/ca.cert.pem > intermediate/certs/ca-chain.cert.pem
chmod 444 intermediate/certs/ca-chain.cert.pem

# https://stackoverflow.com/a/39327439
openssl pkcs8 -topk8 -in intermediate/private/intermediate.key.pem -out intermediate/private/intermediate.key

I’ve also done something similar with Ansible before, but I’ve not got that to hand :)

Late edit, 2019-08-21: Found it! Needs some tweaks to add the sub-CA or child certs, but so-far it would work :)

---
- hosts: localhost
  vars:
  - dnsname: your.dns.name
  - tmppath: "./tmp/"
  - crtpath: "{{ tmppath }}{{ dnsname }}.crt"
  - pempath: "{{ tmppath }}{{ dnsname }}.pem"
  - csrpath: "{{ tmppath }}{{ dnsname }}.csr"
  - pfxpath: "{{ tmppath }}{{ dnsname }}.pfx"
  - private_key_password: "password"
  tasks:
  - file:
      path: "{{ tmppath }}"
      state: absent
  - file:
      path: "{{ tmppath }}"
      state: directory
  - name: "Generate the private key file to sign the CSR"
    openssl_privatekey:
      path: "{{ pempath }}"
      passphrase: "{{ private_key_password }}"
      cipher: aes256
  - name: "Generate the CSR file signed with the private key"
    openssl_csr:
      path: "{{ csrpath }}"
      privatekey_path: "{{ pempath }}"
      privatekey_passphrase: "{{ private_key_password }}"
      common_name: "{{ dnsname }}"
  - name: "Sign the CSR file as a CA to turn it into a certificate"
    openssl_certificate:
      path: "{{ crtpath }}"
      privatekey_path: "{{ pempath }}"
      privatekey_passphrase: "{{ private_key_password }}"
      csr_path: "{{ csrpath }}"
      provider: selfsigned
  - name: "Convert the signed certificate into a PKCS12 file with the attached private key"
    openssl_pkcs12:
      action: export
      path: "{{ pfxpath }}"
      name: "{{ dnsname }}"
      privatekey_path: "{{ pempath }}"
      privatekey_passphrase: "{{ private_key_password }}"
      passphrase: password
      certificate_path: "{{ crtpath }}"
      state: present

Getting Started with Terraform on AWS

2019-08-09 JonTheNiceGuy

I recently wrote a blog post about Getting Started with Terraform on Azure. You might have read it (I know I did!).

Having got a VM stood up in Azure, I wanted to build a VM in AWS; after all, it’s more-or-less the same steps. Note, this is a work-in-progress, and shouldn’t be considered “Final” – this is just something to use as *your* starting block.

What do you need?

You need an AWS account for this. If you’ve not got one, signing up for one is easy, but bear in mind that while there are free resources on AWS (only for the first year!), it’s also quite easy to suddenly enable a load of features that cost you money.

Best practice suggests (or rather, INSISTS) you shouldn’t use your “root” account for AWS. It’s literally just there to let you define the rest of your admin accounts. Turn on MFA (Multi-Factor Authentication) on that account, give it an exceedingly complex password, write that on a sheet of paper, and lock it in a box. You should NEVER use it!

Create your admin account, and log in to that account. Turn on MFA on *that* account too. Then, create an “Access Key” for your account (an access key ID and a secret access key). This is in IAM (Identity and Access Management). This pair is what we’ll use to let Terraform perform actions in AWS, without you needing to actually “log in”.

On my machine, I’ve put the credentials for this in /home/<MYUSER>/.aws/credentials and it looks like this:

[default]
aws_access_key_id = ABC123DEF456
aws_secret_access_key = AaBbCcDd1234EeFf56

This file should be chmod 600, so that only your account can access it. With this token, Terraform can perform *ANY ACTION* as you, including anything that charges you money, or creating servers that can mine a “cryptocurrency” for someone malicious.
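A minimal sketch of locking a file down and checking the result. The `lock_down` function name is my own, and the demonstration uses a throwaway temporary file instead of the real credentials file.

```shell
#!/bin/bash
# Restrict a file to owner read/write only, then print its permission string.
lock_down() {
  local file=$1
  chmod 600 "$file" && ls -l "$file" | cut -c1-10
}

# Demonstrate on a temporary file, not on the real credentials.
demo_file=$(mktemp)
lock_down "$demo_file"
rm -f "$demo_file"
```

For the real thing, you would run `lock_down ~/.aws/credentials` and expect to see `-rw-------`.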

I’m using Windows Subsystem for Linux (WSL). I’m using the Ubuntu 18.04 distribution obtained from the Store. This post won’t explain how to get *that*. Also, you might want to run Terraform on Mac, in Windows or on Linux natively… so, yehr.

Next, we need to actually install Terraform. Excuse the long, unwrapped code block, but it gets what you need quickly (assuming the terraform webpage doesn’t change any time soon!)

mkdir -p ~/bin
cd ~/bin
sudo apt update && sudo apt install unzip
curl -sLO $(curl https://www.terraform.io/downloads.html | grep "linux_amd64.zip" | cut -d\" -f 2) && unzip terraform*.zip && rm terraform*.zip && chmod 755 terraform

Start coding your infrastructure

Before you can build your first virtual machine on AWS, you need to stand up the supporting infrastructure. These are:

An SSH Keypair (no password logins here!)
A VPC (“Virtual Private Cloud”, roughly the same as a VNet on Azure, or somewhat like an L3 switch in the Physical Realm).
An Internet Gateway (if your VPC isn’t classed as “the default one”)
A Subnet.
A Security Group.

Once we’ve got these, we can build our Virtual Machine on EC2 (“Elastic Compute Cloud”), and associate a “Public IP” to it.

To quote my previous post:

One quirk with Terraform, versus other tools like Ansible, is that when you run one of the terraform commands (like terraform init, terraform plan or terraform apply), it reads the entire content of any file suffixed “tf” in that directory, so if you don’t want a file to be loaded, you need to either move it out of the directory, comment it out, or rename it so it doesn’t end .tf. By convention, you normally have three “standard” files in a terraform directory – main.tf, variables.tf and output.tf, but logically speaking, you could have everything in a single file, or each instruction in its own file.
Getting Started with Terraform on Azure – Building the file structure

For the sake of editing and annotating the files for this post, these code blocks are all separated, but on my machine, they’re all currently one big file called “main.tf“.

In that file, I start by telling it that I’m working with the Terraform AWS provider, and that it should target my nearest region.

If you want to risk financial ruin, you can put things like your access tokens in here, but I really wouldn’t chance this!
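The provider block described above is not reproduced here, so this is only a sketch of what a minimal one could look like, written out from the shell. The region value `eu-west-2` and the `write_aws_provider` function name are my assumptions; with no credentials in the block, the AWS provider falls back to the credentials file described earlier.

```shell
#!/bin/bash
# Write a minimal AWS provider block to the given file.
# "eu-west-2" is an assumption; swap in your own nearest region.
write_aws_provider() {
  local target=$1
  cat > "$target" <<'EOF'
provider "aws" {
  region = "eu-west-2"
}
EOF
}

# Demonstrate in a temporary directory and show the result.
workdir=$(mktemp -d)
write_aws_provider "$workdir/provider.tf"
cat "$workdir/provider.tf"
rm -rf "$workdir"
```

In practice you would paste the three lines of Terraform straight into main.tf; the shell wrapper is just there to keep the example self-contained.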

Next, we create our network infrastructure – VPC, Internet Gateway and Subnet. We also change the routing table.

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  tags = {
    Name = "MainVPC"
  }
}

resource "aws_internet_gateway" "igw" {
  vpc_id = "${aws_vpc.main.id}"
  tags = {
    Name = "MainInternetGateway"
  }
}

data "aws_route_table" "route" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route" "clouddev_default_out" {
  route_table_id         = "${data.aws_route_table.route.id}"
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = "${aws_internet_gateway.igw.id}"
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "10.0.0.0/24"
  depends_on = ["aws_internet_gateway.igw"]
  tags = {
    Name = "MainSubnet"
  }
}

I suspect, if I’d created the VPC as “The Default” VPC, then I wouldn’t have needed to amend the routing table, nor added an Internet Gateway. To help us make the routing table change, there’s a “data” block in this section of code. A data block is an instruction to Terraform to go and ask a resource for *something*, in this case, we need AWS to tell Terraform what the routing table is that it created for the VPC. Once we have that we can ask for the routing table change.

AWS doesn’t actually give “proper” names to any of its assets. To provide something with a “real” name, you need to tag that thing with the “Name” tag. These can be practically anything, but I’ve given semi-sensible names to everything. You might want to name everything “main” (like I nearly did)!

We’re getting close to being able to create the VM now. First of all, we’ll create the Security Groups. I want to separate out my “Allow Egress Traffic” rule from my “Inbound SSH” rule. This means that I can clearly see what hosts allow inbound SSH access. Like with my Azure post, I’m using a “data provider” to get my public IP address, but in a normal “live” network, you’d specify a collection of valid source address ranges.

data "http" "icanhazip" {
  url = "http://ipv4.icanhazip.com"
}

resource "aws_security_group" "my_ssh" {
  name   = "my_ssh"
  vpc_id = "${aws_vpc.main.id}"
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["${trimspace(data.http.icanhazip.body)}/32"]
  }
  tags = {
    Name = "my_ssh"
  }
}

resource "aws_security_group" "all_egress" {
  name   = "all_egress"
  vpc_id = "${aws_vpc.main.id}"
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    Name = "all_egress"
  }
}

Last steps before we create the Virtual Machine. We need to upload our SSH key, and we need to find the “AMI” (Amazon Machine Image) ID of the image we’ll be using. To create the key, in this directory, alongside the .tf files, I’ve put my SSH public key (called id_rsa.pub), and we load that key when we create the “my_key” resource. To find the AMI, we need to make another data call, this time asking the AMI index to find the image with the name containing ubuntu-bionic-18.04 and some other stuff. AMIs are region specific, so the image I’m using in eu-west-2 will not have the same AMI ID in eu-west-1 or us-east-1 and so on. This filtering means that, as long as the image exists in that region, we can use “the right one”. So let’s take a look at this file.

resource "aws_key_pair" "my_key" {
  key_name   = "my_key"
  public_key = "${file("id_rsa.pub")}"
}

data "aws_ami" "ubuntu_18_04" {
  most_recent = true
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
  owners = ["099720109477"] # Canonical
}

So, now we have everything we need to create our VM. Let’s do that!

resource "aws_instance" "main" {
  ami           = "${data.aws_ami.ubuntu_18_04.id}"
  instance_type = "t2.micro"
  key_name      = "${aws_key_pair.my_key.key_name}"
  user_data     = "${file("CloudDev.sh")}"
  subnet_id     = "${aws_subnet.main.id}"
  vpc_security_group_ids = [
    "${aws_security_group.all_egress.id}",
    "${aws_security_group.my_ssh.id}"
  ]
  tags = {
    Name = "MyVM"
  }
}

resource "aws_eip" "main" {
  instance = "${aws_instance.main.id}"
  vpc      = true
  tags = {
    Name = "MyVM_EIP"
  }
}

In here, we specify a “user_data” file to upload, in this case, the contents of a file – CloudDev.sh, but you can load anything you want in here. My CloudDev.sh is shown below, so you can see what I’m doing with this file :)

#! /bin/bash
sudo apt update 2>&1 | tee -a /home/ubuntu/log
sudo apt full-upgrade -y 2>&1 | tee -a /home/ubuntu/log
sudo apt install python3-pip git -y 2>&1 | tee -a /home/ubuntu/log
# The python3-pip package on Ubuntu 18.04 provides the pip3 command
sudo -H pip3 install ansible 2>&1 | tee -a /home/ubuntu/log
cd /tmp
sudo git clone https://github.com/JonTheNiceGuy/clouddev.git clouddev 2>&1 | tee -a /home/ubuntu/log
cd clouddev
sudo ansible-playbook main.yml 2>&1 | tee -a /home/ubuntu/log
touch /home/ubuntu/finished

So, having created all this lot, you need to execute the terraform workload. Initially you do terraform init. This downloads the provider plugins and puts them into the same tree as these .tf files are stored in (under a .terraform directory). It also initialises the backend where Terraform keeps its state.

Next, you do terraform plan -out tfout. Technically, the tfout part can be any filename, but having something like tfout marks it as clearly part of Terraform. This creates the tfout file, which records the changes Terraform intends to make on its next apply, based on the current state. Typically, if you don’t use a tfout file within about 20 minutes, it’s probably worth removing it, as the plan may no longer match reality.

Finally, once you’ve run your plan stage, you need to apply it. In this case you execute terraform apply tfout. This tfout is the same filename you specified in terraform plan. If you leave tfout off your apply, you can skip the separate terraform plan stage entirely: terraform apply then works out the plan itself and asks you to confirm it before making any changes.
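Pulling those three steps together, a rough sketch of the cycle might look like this. `TERRAFORM_BIN` is a hypothetical override of my own, so the sequence can be previewed with `echo` rather than the real binary, and `terraform_cycle` is just an illustrative name.

```shell
#!/bin/bash
# Run init, plan (saved to a file) and apply (from that file) in order,
# stopping at the first step that fails.
# TERRAFORM_BIN is a hypothetical override; it defaults to "terraform".
terraform_cycle() {
  local tf=${TERRAFORM_BIN:-terraform}
  local plan_file=${1:-tfout}
  "$tf" init &&
    "$tf" plan -out "$plan_file" &&
    "$tf" apply "$plan_file"
}

# Preview the commands without touching any infrastructure.
TERRAFORM_BIN=echo terraform_cycle tfout
```

Without the override, `terraform_cycle tfout` runs the real commands in the current directory. The plan file is left behind afterwards, so remove it once it has been applied.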

Once you’re done with your environment, use terraform destroy to shut it all down… and enjoy :)

Featured image is “code crunching” by “Ruben Molina” on Flickr and is released under a CC-ND license.

Getting Started with Terraform on Azure

2019-06-12 (updated 2019-06-13) JonTheNiceGuy

I’m strongly in the “Ansible is my tool, what needs fixing” camp when it comes to Infrastructure as Code (IaC), but I know there are other tools out there which are equally good. I’ve been strongly advised to take a look at Terraform from HashiCorp. I’m most familiar at the moment with Azure, so this is going to be based around resources available on Azure.

Late edit: I want to credit my colleague, Pete, for his help getting started with this. While many of the code samples have been changed from what he provided me with, if it hadn’t been for these code samples in the first place, I’d never have got started!

Late edit 2: This post was initially based on Terraform 0.11, and I was prompted by another colleague, Jon, that the available documentation still follows the 0.11 layout. 0.12 was released in May, and changes how variables are reused in the code. This post now *should* follow the 0.12 conventions, but if you spot something where it doesn’t, check out this post from the Terraform team.

As with most things, there’s a learning curve, and I struggled to find a “simple” getting started guide for Terraform. I’m sure this is a failing on my part, but I thought it wouldn’t hurt to put something out there for others to pick up and see if it helps someone else (and, if that “someone else” is you, please let me know in the comments!)

Pre-requisites

You need an Azure account for this. This part is very far outside my spectrum of influence, but I’m assuming you’ve got one. If not, look at something like Digital Ocean, AWS or VMWare :) For my “controller”, I’m using Windows Subsystem for Linux (WSL), and wrote the following notes about getting my pre-requisites.

# Pre-requisites

```
mkdir -p ~/bin
cd ~/bin
sudo apt update && sudo apt install unzip
```

# Install AzureCLI

[Source](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-apt)

```
curl -sL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc.gpg > /dev/null
echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/azure-cli.list > /dev/null
sudo apt update && sudo apt install azure-cli
```

# Install Kubectl in WSL

[Source](https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-binary-with-curl-on-linux)

```
curl -sLO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && chmod 755 kubectl
```

# Install Terraform in WSL

[Source](https://techcommunity.microsoft.com/t5/Azure-Developer-Community-Blog/Configuring-Terraform-on-Windows-10-Linux-Sub-System/ba-p/393845)

```
curl -sLO $(curl https://www.terraform.io/downloads.html | grep "linux_amd64.zip" | cut -d\" -f 2) && unzip terraform*.zip && rm terraform*.zip && chmod 755 terraform
```

# Install terraform extension for VSCode

https://marketplace.visualstudio.com/items?itemName=mauve.terraform

# Define bash as Default VSCode Shell

[Source](https://code.visualstudio.com/docs/editor/integrated-terminal)

1. ctrl+shift+p (Command palette)
1. Type: `default shell` and select `Terminal: Select Default Shell`
1. Choose "WSL Bash"

Building the file structure

One quirk with Terraform, versus other tools like Ansible, is that when you run one of the terraform commands (like terraform init, terraform plan or terraform apply), it reads the entire content of every file ending .tf in that directory. If you don’t want a file to be loaded, you need to either move it out of the directory, comment it out, or rename it so it doesn’t end .tf. By convention, you normally have three “standard” files in a terraform directory – main.tf, variables.tf and output.tf – but logically speaking, you could have everything in a single file, or each instruction in its own file. Because this is a relatively simple script, I’ll use this standard layout.

The actions I’ll be performing are the “standard” steps you’d perform in Azure to build a single Infrastructure as a Service (IAAS) server service:

Create your Resource Group (RG)
Create a Virtual Network (VNET)
Create a Subnet
Create a Security Group (SG) and rules
Create a Public IP address (PubIP) with a DNS name associated to that IP.
Create a Network Interface (NIC)
Create a Virtual Machine (VM), supplying a username and password, the size of disks and VM instance, and any post-provisioning instructions (yep, I’m using Ansible for that :) ).

I’m using Visual Studio Code, but almost any IDE will have integrations for Terraform. The main things I’m using it for are auto-completion of resource, data and output types, and the fact that control+clicking a resource type opens your browser at its documentation page on terraform.io.

So, creating my main.tf, I start by telling it that I’m working with the Terraform AzureRM Provider (the bit of code that can talk Azure API).
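
The provider statement itself isn’t reproduced in this text. As a minimal sketch, assuming no extra provider settings are needed (which was the case for the 1.x AzureRM provider; later major versions also expect a features block inside it), it might look like this:

```hcl
# main.tf
# Assumption: an empty provider block, with credentials coming from "az login".
provider "azurerm" {
}
```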

This simple statement is enough to get Terraform to load the AzureRM provider, but it still doesn’t tell Terraform how to get access to the Azure account. Use az login from a WSL shell session to authenticate.

Next, we create our basic resource group, VNet and subnet resources.

resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.location
}

resource "azurerm_virtual_network" "vnet" {
  name                = var.vnet_name
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  address_space       = ["10.0.0.0/16"]
  dns_servers         = ["8.8.8.8", "8.8.4.4"]
}

resource "azurerm_subnet" "subnet" {
  name                 = var.subnet_name
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefix       = "10.0.1.0/24"
}

But wait, I hear you cry, what are those var.something bits in there? I mentioned before that in the “standard” set of files is a “variables.tf” file. In here, you specify values for later consumption. I have recorded variables for the resource group name and location, as well as the VNet name and subnet name. Let’s add those into variables.tf.

variable resource_group_name {
  default = "MFIOT201906"
}

variable vnet_name {
  default = "MFIOT201906_vnet"
}

variable subnet_name {
  default = "MFIOT201906_subnet"
}

variable location {
  default = "UK South"
}

When you’ve specified a resource, you can capture any of the results from that resource to use later – either in the main.tf or in the output.tf files. By creating the resource group (called “rg” here, but you can call it anything from “demo” to “myfirstresourcegroup”), we can consume the name or location with azurerm_resource_group.rg.name and azurerm_resource_group.rg.location, and so on. In the above code, we use the VNet name in the subnet, and so on.

After the subnet is created, we can start adding the VM specific parts – a security group (with rules), a public IP (with DNS name) and a network interface. I’ll create the VM itself later. So, let’s do this.

resource "azurerm_network_security_group" "iaasnsg" {
  name                = "iaas-nsg"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
}

resource "azurerm_network_security_rule" "iaasnsgr" {
  name                        = "iaas-nsg-100"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "22"
  source_address_prefix       = "${trimspace(data.http.icanhazip.body)}/32"
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.iaasnsg.name
}

resource "azurerm_public_ip" "iaaspubip" {
  name                = "iaas-pubip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Dynamic"
  domain_name_label   = var.dns_prefix
}

resource "azurerm_network_interface" "iaasnic" {
  name                      = "iaas-nic"
  location                  = azurerm_resource_group.rg.location
  resource_group_name       = azurerm_resource_group.rg.name
  network_security_group_id = azurerm_network_security_group.iaasnsg.id

  ip_configuration {
    name                          = "iaas-nic-ip"
    subnet_id                     = azurerm_subnet.subnet.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.iaaspubip.id
  }
}

BUT WAIT, what’s that ${trimspace(data.http.icanhazip.body)}/32 bit there?? Any resources we want to load from the terraform state, but that we’ve not directly defined ourselves, need to come from somewhere. These items are classed as “data” – that is, we want to know what their values are, but we aren’t *changing* the service to get it. You can also use this to import other resource items, perhaps a virtual network that is created by another team, or a resource group that your account doesn’t have the rights to create. I’ll include a commented out data block in the overall main.tf file for review that specifies a VNet if you want to see how that works.
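
To give a feel for what a data block for someone else’s VNet might look like, here’s a rough sketch. The label “existing_vnet” and both names are placeholders I’ve made up for illustration, not values from my environment:

```hcl
# Hypothetical names: "existing_vnet", "shared-vnet" and "shared-network-rg" are placeholders.
data "azurerm_virtual_network" "existing_vnet" {
  name                = "shared-vnet"
  resource_group_name = "shared-network-rg"
}

# A subnet could then point at the looked-up VNet instead of one we created:
#   virtual_network_name = data.azurerm_virtual_network.existing_vnet.name
#   resource_group_name  = data.azurerm_virtual_network.existing_vnet.resource_group_name
```

Note the data. prefix when you refer to it; that’s what tells Terraform this is something to read, not something to build.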

In this case, I want to put the public IP address I’m coming from into the NSG Rule, so I can get access to the VM, without opening it up to *everyone*. I’m not that sure that my IP address won’t change between one run and the next, so I’m using the icanhazip.com service to determine my IP address. But I’ve not defined how to get that resource yet. Let’s add it to the main.tf for now.
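
The block itself isn’t reproduced in this text, so here’s a minimal sketch. The data source type and name (http and icanhazip) match the data.http.icanhazip reference used in the security rule, but the exact URL is my assumption:

```hcl
# Assumption: the plain-text IPv4 endpoint of icanhazip.com; the URL isn't given in the text.
data "http" "icanhazip" {
  url = "http://ipv4.icanhazip.com"
}
```

The service returns the address followed by a newline, which is why the security rule wraps the body in trimspace() before adding /32.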

So, we’re now ready to create our virtual machine. It’s quite a long block, but I’ll pull certain elements apart once I’ve pasted this block in.

resource "azurerm_virtual_machine" "main" {
  name                             = "iaas-vm"
  location                         = azurerm_resource_group.rg.location
  resource_group_name              = azurerm_resource_group.rg.name
  network_interface_ids            = [azurerm_network_interface.iaasnic.id]
  vm_size                          = "Standard_DS1_v2"
  delete_os_disk_on_termination    = true
  delete_data_disks_on_termination = true

  storage_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  storage_os_disk {
    name              = "iaas-os-disk"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  os_profile {
    computer_name  = "iaas"
    admin_username = var.ssh_user
    admin_password = var.ssh_password
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }

  provisioner "remote-exec" {
    inline = ["mkdir /tmp/ansible"]

    connection {
      type     = "ssh"
      host     = azurerm_public_ip.iaaspubip.fqdn
      user     = var.ssh_user
      password = var.ssh_password
    }
  }

  provisioner "file" {
    source      = "ansible/"
    destination = "/tmp/ansible"

    connection {
      type     = "ssh"
      host     = azurerm_public_ip.iaaspubip.fqdn
      user     = var.ssh_user
      password = var.ssh_password
    }
  }

  provisioner "remote-exec" {
    inline = [
      "sudo apt update > /tmp/apt_update || cat /tmp/apt_update",
      "sudo apt install -y python3-pip > /tmp/apt_install_python3_pip || cat /tmp/apt_install_python3_pip",
      "sudo -H pip3 install ansible > /tmp/pip_install_ansible || cat /tmp/pip_install_ansible",
      "ansible-playbook /tmp/ansible/main.yml"
    ]

    connection {
      type     = "ssh"
      host     = azurerm_public_ip.iaaspubip.fqdn
      user     = var.ssh_user
      password = var.ssh_password
    }
  }
}

So, this is broken into four main pieces.

Virtual Machine Details. This part is relatively sensible: name, RG, location, NIC, size, and what happens to the disks when the machine is deleted. OK.

name                             = "iaas-vm"
location                         = azurerm_resource_group.rg.location
resource_group_name              = azurerm_resource_group.rg.name
network_interface_ids            = [azurerm_network_interface.iaasnic.id]
vm_size                          = "Standard_DS1_v2"
delete_os_disk_on_termination    = true
delete_data_disks_on_termination = true

Disk details.

storage_image_reference {
  publisher = "Canonical"
  offer     = "UbuntuServer"
  sku       = "18.04-LTS"
  version   = "latest"
}
storage_os_disk {
  name              = "iaas-os-disk"
  caching           = "ReadWrite"
  create_option     = "FromImage"
  managed_disk_type = "Standard_LRS"
}

OS basics: VM hostname, username of the first user, and its password. Note, if you want to use an SSH key, the private key must be stored without a passphrase so that Terraform can use it. If you mention an SSH key here, as well as a password, this can cause all sorts of connection issues, so pick one or the other.

os_profile {
  computer_name  = "iaas"
  admin_username = var.ssh_user
  admin_password = var.ssh_password
}
os_profile_linux_config {
  disable_password_authentication = false
}
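
If you’d rather go down the SSH key route, the same part of the resource might look something like the sketch below. This is a fragment for inside the azurerm_virtual_machine block, not something I ran for this post, and both key paths are assumptions (a passphrase-less key pair in ~/.ssh):

```hcl
os_profile {
  computer_name  = "iaas"
  admin_username = var.ssh_user
  # No admin_password here: pick a key or a password, not both.
}

os_profile_linux_config {
  disable_password_authentication = true

  ssh_keys {
    # The public key is written to the admin user's authorized_keys file.
    path     = "/home/${var.ssh_user}/.ssh/authorized_keys"
    key_data = file(pathexpand("~/.ssh/id_rsa.pub"))
  }
}

# Each provisioner connection would then swap password for private_key:
#   connection {
#     type        = "ssh"
#     host        = azurerm_public_ip.iaaspubip.fqdn
#     user        = var.ssh_user
#     private_key = file(pathexpand("~/.ssh/id_rsa"))
#   }
```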

And lastly, provisioning. I want to use Ansible for my provisioning. In this example, I have a basic playbook stored locally on my Terraform host, which I transfer to the VM, install Ansible via pip, and then execute ansible-playbook against the file I uploaded. This could just as easily be a git repo to clone or a shell script to copy in, but this is a “simple” example.

provisioner "remote-exec" {
  inline = ["mkdir /tmp/ansible"]

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

provisioner "file" {
  source      = "ansible/"
  destination = "/tmp/ansible"

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

provisioner "remote-exec" {
  inline = [
    "sudo apt update > /tmp/apt_update || cat /tmp/apt_update",
    "sudo apt install -y python3-pip > /tmp/apt_install_python3_pip || cat /tmp/apt_install_python3_pip",
    "sudo -H pip3 install ansible > /tmp/pip_install_ansible || cat /tmp/pip_install_ansible",
    "ansible-playbook /tmp/ansible/main.yml"
  ]

  connection {
    type     = "ssh"
    host     = azurerm_public_ip.iaaspubip.fqdn
    user     = var.ssh_user
    password = var.ssh_password
  }
}

This part of code is done in three parts – create upload path, copy the files in, and then execute it. If you don’t create the upload path, it’ll upload just the first file it comes to into the path specified.

Each remote-exec and file provisioner statement must include the hostname, username and either the password, or SSH private key. In this example, I provide just the password.
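
The snippets above lean on var.dns_prefix, var.ssh_user and var.ssh_password, which aren’t in the variables.tf shown earlier. A minimal sketch of those extra entries follows; the dns_prefix value is a placeholder (it needs to be unique within the Azure region), and leaving the password without a default is my choice here, not necessarily what the Gist does:

```hcl
# variables.tf (additions)
variable dns_prefix {
  # Placeholder value; this becomes the first label of the VM's public DNS name.
  default = "my-iaas-demo"
}

variable ssh_user {
  default = "tf_admin"
}

variable ssh_password {
  # Deliberately no default, so the password stays out of the file.
}
```

With no default set, Terraform will prompt for the value at plan time, or you can supply it through a TF_VAR_ssh_password environment variable.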

When I ran this, with a handful of changes to the variable files, I got this result:

$ terraform init

Initializing the backend...

Initializing provider plugins...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.azurerm: version = "~> 1.30"
* provider.http: version = "~> 1.1"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
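
That init output is nudging us to pin the provider versions. Following its suggestion would look roughly like this, using the constraint strings it printed. The azurerm block here would replace the existing provider block in main.tf, not sit alongside it, and I believe newer Terraform releases prefer these constraints in a required_providers block, so treat this as a sketch for the version used in this post:

```hcl
# Constraint strings copied from the "terraform init" suggestions.
provider "azurerm" {
  version = "~> 1.30"
}

provider "http" {
  version = "~> 1.1"
}
```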

$ terraform plan -out tfout
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.http.icanhazip: Refreshing state...

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # azurerm_network_interface.iaasnic will be created
  + resource "azurerm_network_interface" "iaasnic" {
      + applied_dns_servers = (known after apply)
      + dns_servers = (known after apply)
      + enable_accelerated_networking = false
      + enable_ip_forwarding = false
      + id = (known after apply)
      + internal_dns_name_label = (known after apply)
      + internal_fqdn = (known after apply)
      + location = "uksouth"
      + mac_address = (known after apply)
      + name = "iaas-nic"
      + network_security_group_id = (known after apply)
      + private_ip_address = (known after apply)
      + private_ip_addresses = (known after apply)
      + resource_group_name = "20190611_JS_RG"
      + tags = (known after apply)
      + virtual_machine_id = (known after apply)

      + ip_configuration {
          + application_gateway_backend_address_pools_ids = (known after apply)
          + application_security_group_ids = (known after apply)
          + load_balancer_backend_address_pools_ids = (known after apply)
          + load_balancer_inbound_nat_rules_ids = (known after apply)
          + name = "iaas-nic-ip"
          + primary = (known after apply)
          + private_ip_address_allocation = "dynamic"
          + private_ip_address_version = "IPv4"
          + public_ip_address_id = (known after apply)
          + subnet_id = (known after apply)
        }
    }

  # azurerm_network_security_group.iaasnsg will be created
  + resource "azurerm_network_security_group" "iaasnsg" {
      + id = (known after apply)
      + location = "uksouth"
      + name = "iaas-nsg"
      + resource_group_name = "20190611_JS_RG"
      + security_rule = (known after apply)
      + tags = (known after apply)
    }

  # azurerm_network_security_rule.iaasnsgr will be created
  + resource "azurerm_network_security_rule" "iaasnsgr" {
      + access = "Allow"
      + destination_address_prefix = "*"
      + destination_port_range = "22"
      + direction = "Inbound"
      + id = (known after apply)
      + name = "iaas-nsg-100"
      + network_security_group_name = "iaas-nsg"
      + priority = 100
      + protocol = "Tcp"
      + resource_group_name = "20190611_JS_RG"
      + source_address_prefix = "89.101.76.85/32"
      + source_port_range = "*"
    }

  # azurerm_public_ip.iaaspubip will be created
  + resource "azurerm_public_ip" "iaaspubip" {
      + allocation_method = "Dynamic"
      + domain_name_label = "js-20190611-iaas-demo"
      + fqdn = (known after apply)
      + id = (known after apply)
      + idle_timeout_in_minutes = 4
      + ip_address = (known after apply)
      + ip_version = "IPv4"
      + location = "uksouth"
      + name = "iaas-pubip"
      + public_ip_address_allocation = (known after apply)
      + resource_group_name = "20190611_JS_RG"
      + sku = "Basic"
      + tags = (known after apply)
    }

  # azurerm_resource_group.rg will be created
  + resource "azurerm_resource_group" "rg" {
      + id = (known after apply)
      + location = "uksouth"
      + name = "20190611_JS_RG"
      + tags = (known after apply)
    }

  # azurerm_subnet.subnet will be created
  + resource "azurerm_subnet" "subnet" {
      + address_prefix = "10.0.1.0/24"
      + id = (known after apply)
      + ip_configurations = (known after apply)
      + name = "20190611_JS_subnet"
      + resource_group_name = "20190611_JS_RG"
      + virtual_network_name = "20190611_JS_vnet"
    }

  # azurerm_virtual_machine.main will be created
  + resource "azurerm_virtual_machine" "main" {
      + availability_set_id = (known after apply)
      + delete_data_disks_on_termination = true
      + delete_os_disk_on_termination = true
      + id = (known after apply)
      + license_type = (known after apply)
      + location = "uksouth"
      + name = "iaas-vm"
      + network_interface_ids = (known after apply)
      + resource_group_name = "20190611_JS_RG"
      + tags = (known after apply)
      + vm_size = "Standard_DS1_v2"

      + identity {
          + identity_ids = (known after apply)
          + principal_id = (known after apply)
          + type = (known after apply)
        }

      + os_profile {
          + admin_password = (sensitive value)
          + admin_username = "tf_admin"
          + computer_name = "iaas"
          + custom_data = (known after apply)
        }

      + os_profile_linux_config {
          + disable_password_authentication = false
        }

      + storage_data_disk {
          + caching = (known after apply)
          + create_option = (known after apply)
          + disk_size_gb = (known after apply)
          + lun = (known after apply)
          + managed_disk_id = (known after apply)
          + managed_disk_type = (known after apply)
          + name = (known after apply)
          + vhd_uri = (known after apply)
          + write_accelerator_enabled = (known after apply)
        }

      + storage_image_reference {
          + offer = "UbuntuServer"
          + publisher = "Canonical"
          + sku = "18.04-LTS"
          + version = "latest"
        }

      + storage_os_disk {
          + caching = "ReadWrite"
          + create_option = "FromImage"
          + disk_size_gb = (known after apply)
          + managed_disk_id = (known after apply)
          + managed_disk_type = "Standard_LRS"
          + name = "iaas-os-disk"
          + os_type = (known after apply)
          + write_accelerator_enabled = false
        }
    }

  # azurerm_virtual_network.vnet will be created
  + resource "azurerm_virtual_network" "vnet" {
      + address_space = [
          + "10.0.0.0/16",
        ]
      + dns_servers = [
          + "8.8.8.8",
          + "8.8.4.4",
        ]
      + id = (known after apply)
      + location = "uksouth"
      + name = "20190611_JS_vnet"
      + resource_group_name = "20190611_JS_RG"
      + tags = (known after apply)

      + subnet {
          + address_prefix = (known after apply)
          + id = (known after apply)
          + name = (known after apply)
          + security_group = (known after apply)
        }
    }

Plan: 8 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

This plan was saved to: tfout

To perform exactly these actions, run the following command to apply:
    terraform apply "tfout"

$ terraform apply tfout
azurerm_resource_group.rg: Creating...
azurerm_resource_group.rg: Creation complete after 1s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906]
azurerm_network_security_group.iaasnsg: Creating...
azurerm_virtual_network.vnet: Creating...
azurerm_public_ip.iaaspubip: Creating...
azurerm_public_ip.iaaspubip: Creation complete after 4s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/publicIPAddresses/iaas-pubip]
azurerm_network_security_group.iaasnsg: Still creating... [11s elapsed]
azurerm_virtual_network.vnet: Still creating... [11s elapsed]
azurerm_virtual_network.vnet: Creation complete after 11s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/virtualNetworks/MFIOT201906_vnet]
azurerm_network_security_group.iaasnsg: Creation complete after 12s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/networkSecurityGroups/iaas-nsg]
azurerm_network_security_rule.iaasnsgr: Creating...
azurerm_subnet.subnet: Creating...
azurerm_network_security_rule.iaasnsgr: Creation complete after 1s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/networkSecurityGroups/iaas-nsg/securityRules/iaas-nsg-100]
azurerm_subnet.subnet: Still creating... [10s elapsed]
azurerm_subnet.subnet: Creation complete after 10s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/virtualNetworks/MFIOT201906_vnet/subnets/MFIOT201906_subnet]
azurerm_network_interface.iaasnic: Creating...
azurerm_network_interface.iaasnic: Creation complete after 0s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Network/networkInterfaces/iaas-nic]
azurerm_virtual_machine.main: Creating...
azurerm_virtual_machine.main: Still creating... [10s elapsed]
azurerm_virtual_machine.main: Still creating... [20s elapsed]
azurerm_virtual_machine.main: Still creating... [30s elapsed]
azurerm_virtual_machine.main: Provisioning with 'remote-exec'...
azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH...
azurerm_virtual_machine.main (remote-exec):   Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com
azurerm_virtual_machine.main (remote-exec):   User: tf_admin
azurerm_virtual_machine.main (remote-exec):   Password: true
azurerm_virtual_machine.main (remote-exec):   Private key: false
azurerm_virtual_machine.main (remote-exec):   Certificate: false
azurerm_virtual_machine.main (remote-exec):   SSH Agent: true
azurerm_virtual_machine.main (remote-exec):   Checking Host Key: false
azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH...
azurerm_virtual_machine.main (remote-exec):   Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com
azurerm_virtual_machine.main (remote-exec):   User: tf_admin
azurerm_virtual_machine.main (remote-exec):   Password: true
azurerm_virtual_machine.main (remote-exec):   Private key: false
azurerm_virtual_machine.main (remote-exec):   Certificate: false
azurerm_virtual_machine.main (remote-exec):   SSH Agent: true
azurerm_virtual_machine.main (remote-exec):   Checking Host Key: false
azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH...
azurerm_virtual_machine.main (remote-exec):   Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com
azurerm_virtual_machine.main (remote-exec):   User: tf_admin
azurerm_virtual_machine.main (remote-exec):   Password: true
azurerm_virtual_machine.main (remote-exec):   Private key: false
azurerm_virtual_machine.main (remote-exec):   Certificate: false
azurerm_virtual_machine.main (remote-exec):   SSH Agent: true
azurerm_virtual_machine.main (remote-exec):   Checking Host Key: false
azurerm_virtual_machine.main: Still creating... [40s elapsed]
azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH...
azurerm_virtual_machine.main (remote-exec):   Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com
azurerm_virtual_machine.main (remote-exec):   User: tf_admin
azurerm_virtual_machine.main (remote-exec):   Password: true
azurerm_virtual_machine.main (remote-exec):   Private key: false
azurerm_virtual_machine.main (remote-exec):   Certificate: false
azurerm_virtual_machine.main (remote-exec):   SSH Agent: true
azurerm_virtual_machine.main (remote-exec):   Checking Host Key: false
azurerm_virtual_machine.main (remote-exec): Connected!
azurerm_virtual_machine.main: Provisioning with 'file'...
azurerm_virtual_machine.main: Provisioning with 'remote-exec'...
azurerm_virtual_machine.main (remote-exec): Connecting to remote host via SSH...
azurerm_virtual_machine.main (remote-exec):   Host: myfirstiaasonterraform.uksouth.cloudapp.azure.com
azurerm_virtual_machine.main (remote-exec):   User: tf_admin
azurerm_virtual_machine.main (remote-exec):   Password: true
azurerm_virtual_machine.main (remote-exec):   Private key: false
azurerm_virtual_machine.main (remote-exec):   Certificate: false
azurerm_virtual_machine.main (remote-exec):   SSH Agent: true
azurerm_virtual_machine.main (remote-exec):   Checking Host Key: false
azurerm_virtual_machine.main (remote-exec): Connected!
azurerm_virtual_machine.main: Still creating... [50s elapsed]
azurerm_virtual_machine.main (remote-exec): WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
azurerm_virtual_machine.main (remote-exec): WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
azurerm_virtual_machine.main: Still creating... [1m0s elapsed]
azurerm_virtual_machine.main (remote-exec): Extracting templates from packages: 47%
azurerm_virtual_machine.main (remote-exec): Extracting templates from packages: 95%
azurerm_virtual_machine.main (remote-exec): Extracting templates from packages: 100%
azurerm_virtual_machine.main: Still creating... [1m10s elapsed]
azurerm_virtual_machine.main: Still creating... [1m20s elapsed]
azurerm_virtual_machine.main: Still creating... [1m30s elapsed]
azurerm_virtual_machine.main: Still creating... [1m40s elapsed]
azurerm_virtual_machine.main: Still creating... [1m50s elapsed]
azurerm_virtual_machine.main: Still creating... [2m0s elapsed]
azurerm_virtual_machine.main: Still creating... [2m10s elapsed]
azurerm_virtual_machine.main: Still creating... [2m20s elapsed]
azurerm_virtual_machine.main: Still creating... [2m30s elapsed]
azurerm_virtual_machine.main: Still creating... [2m40s elapsed]
azurerm_virtual_machine.main (remote-exec): [WARNING]: No inventory was parsed, only implicit localhost is available
azurerm_virtual_machine.main (remote-exec):
azurerm_virtual_machine.main (remote-exec): [WARNING]: provided hosts list is empty, only localhost is available. Note
azurerm_virtual_machine.main (remote-exec): that the implicit localhost does not match 'all'
azurerm_virtual_machine.main (remote-exec):
azurerm_virtual_machine.main (remote-exec): PLAY [localhost] ***************************************************************
azurerm_virtual_machine.main (remote-exec): TASK [debug] *******************************************************************
azurerm_virtual_machine.main (remote-exec): ok: [localhost] => {
azurerm_virtual_machine.main (remote-exec):     "msg": "Hello world!"
azurerm_virtual_machine.main (remote-exec): }
azurerm_virtual_machine.main (remote-exec): PLAY RECAP *********************************************************************
azurerm_virtual_machine.main (remote-exec): localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
azurerm_virtual_machine.main: Creation complete after 2m50s [id=/subscriptions/decafbad-1234-abcd-5678-abcdef123456/resourceGroups/MFIOT201906/providers/Microsoft.Compute/virtualMachines/iaas-vm]

Apply complete! Resources: 8 added, 0 changed, 0 destroyed.

The state of your infrastructure has been saved to the path
below. This state is required to modify and destroy your
infrastructure, so keep it safe. To inspect the complete state
use the `terraform show` command.

State path: terraform.tfstate

Outputs:

host = myfirstiaasonterraform.uksouth.cloudapp.azure.com
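
The host value at the end of that run comes from output.tf, which isn’t shown in this text. A minimal sketch that would produce a line like it, assuming the output simply exposes the FQDN of the public IP:

```hcl
# output.tf
# Assumption: "host" is the public IP's DNS name, matching the connection blocks above.
output "host" {
  value = azurerm_public_ip.iaaspubip.fqdn
}
```

Running terraform output host later on prints the same value again without needing another apply.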

Once you’re done with your environment, use terraform destroy to shut it all down… and enjoy :)

The full source is available in the associated Gist. Pull requests and constructive criticism are very welcome!

Featured image is “Seca” by “Olearys” on Flickr and is released under a CC-BY license.

Run an Ansible Playbook against a Check Point Gaia node running R80+

2019-06-052019-11-05 JonTheNiceGuy Leave a comment

Late Edit – 2019-11-05: Ansible 2.9 has some Check Point modules for interacting with the Check Point Manager API which are actually Idempotent, and if you’re running Ansible <=2.8, there are some non-idempotent modules available directly from Check Point. This post is about interacting with the OS. The OS might now be much more addressable using ansible_connection=ssh!

In Check Point Gaia R77, if you wanted to run Ansible against this node, you were completely out of luck. The version of Python on the host was broken, modules were missing and … well, it just wouldn’t work.

Today, I’m looking at running some Ansible playbooks against Check Point R80 nodes. Here are some steps you need to get through to make it work.

Make sure the user that Ansible is going to be using has the shell /bin/bash. If you don’t have this set up, the command is: set user ansible shell /bin/bash.
If you want a separate user account to do ansible actions, run these commands:
add user ansible uid 9999 homedir /home/ansible
set user ansible password-hash $1$D3caF9$B4db4Ddecafbadnogoood (note this hash is not valid!)
add rba user ansible roles adminRole
set user ansible shell /bin/bash
Make sure your inventory specifies the right path for your Python binary. In the next code block you’ll see my inventory for three separate Check Point R80+ nodes. Note that I’ll only be targeting the “checkpoint” group, but that I’m using the r80_10, r80_20 and r80_30 groups to load the variables into there. I could, alternatively, add these in as values in group_vars/r80_10.yml and so on, but I find keeping everything to do with my connection in one place much cleaner. The Python interpreter is in a different path for each version, and if you don’t specify ansible_ssh_transfer_method=piped you’ll get a message like this: [WARNING]: sftp transfer mechanism failed on [cpr80-30]. Use ANSIBLE_DEBUG=1 to see detailed information (fix from Add pipeline-ish method using dd for file transfer over SSH (#18642) on the Ansible git repo)

[checkpoint]
cpr80-10        ansible_user=admin      ansible_password=Sup3rS3cr3t-
cpr80-20        ansible_user=admin      ansible_password=Sup3rS3cr3t-
cpr80-30        ansible_user=admin      ansible_password=Sup3rS3cr3t-

[r80_10]
cpr80-10

[r80_20]
cpr80-20

[r80_30]
cpr80-30

[r80_10:vars]
ansible_ssh_transfer_method=piped
ansible_python_interpreter=/opt/CPsuite-R80/fw1/Python/bin/python

[r80_20:vars]
ansible_ssh_transfer_method=piped
ansible_python_interpreter=/opt/CPsuite-R80.20/fw1/Python/bin/python

[r80_30:vars]
ansible_ssh_transfer_method=piped
ansible_python_interpreter=/opt/CPsuite-R80.30/fw1/Python/bin/python

And there you have it, one quick “ping” check later…

$ ansible -m 'ping' -i hosts checkpoint
cpr80-10 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
cpr80-30 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
cpr80-20 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

One quick word of warning though, don’t use gather_facts: true or the setup: module. Both of these still rely on missing libraries on the Check Point nodes, and won’t work… But then again, you can get whatever you need from shell commands….. right? ;)

Featured image is “Untitled” by “Ryan Dickey” on Flickr and is released under a CC-BY license.

Building a Gitlab and Ansible Tower (AWX) Demo in Vagrant with Ansible

2019-05-162019-05-16 JonTheNiceGuy Leave a comment

TL;DR – I created a repository on GitHub containing a Vagrantfile and an Ansible Playbook to build a VM running Docker. That VM hosts AWX (Ansible Tower’s upstream open-source project) and Gitlab.

A couple of years ago, a colleague created (and I enhanced) a Vagrant and Ansible playbook called “Project X” which would run an AWX instance in a Virtual Machine. It was a bit heavy, and did a lot of things to do with persistence that I really didn’t need, so I parked my changes and kept an eye on his playbook…

Fast-forward to a week-or-so ago. I needed to explain what a Git/Ansible Workflow would look like, and so I went back to look at ProjectX. Oh my, it looked very complex and consumed a lot of roles that, historically, I’ve not been that impressed with… I just needed the basics to run AWX. Oh, and I also needed a Gitlab environment.

I knew that Gitlab had a docker-based install, and so does AWX, so I trundled off to find some install guides. These are listed in the playbook I eventually created (hence not listing them here). Not all the choices I made were inspired by those guides – I wanted to make quite a bit of this stuff “build itself”… this meant I wanted users, groups and projects to be created in Gitlab, and users, projects, organisations, inventories and credentials to be created in AWX.

I knew that you can create Docker Containers in Ansible, so after I’d got my pre-requisites built (full upgrade, docker installed, pip libraries installed), I added the gitlab-ce:latest docker image, and exposed some ports. Even now, I’m not getting the SSH port mapped that I was expecting, but … it’s no disaster.

I did notice that the Gitlab service takes ages to start once the container is marked as running, so I did some more digging, and found that the uri module can be used to poll a URL. It wasn’t documented well how you can make it keep polling until you get the response you want, so … I added a PR on the Ansible project’s github repo for that one (and I also wrote a blog post about that earlier too).

Once I had a working Gitlab service, I needed to customize it. There are a bunch of Gitlab modules in Ansible, but they stopped working a few Gitlab releases ago, so I had to find a different way. That different way was to run an internal command called “gitlab-rails”. It’s not perfect (so it doesn’t create repos in your projects) but it’s pretty good at giving you just enough to build your demo environment. So that’s getting Gitlab up…

Now I need to build AWX. There’s lots of build guides for this, but actually I had most luck using the README in their repository (I know, who’d have thought it!??!) There are some “Secrets” that should be changed in production that I’m changing in my script, but on the whole, it’s pretty much a vanilla install.

Unlike the Gitlab modules, the Ansible Tower modules all work, so I use these to create the users, credentials and so-on. Like the gitlab-rails commands, however, the documentation for using the tower modules is pretty ropey, and I still don’t have things like “getting your users to have access to your organisation” working from the get-go, but for the bulk of the administration, it does “just work”.
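To give a flavour of the tower_ modules, here is a small sketch that creates an organisation and a user. The organisation name, the AWX address and the variable names are all made up for this example, so treat it as a starting point rather than what the playbook actually does:

```yaml
- name: "Create an organisation in AWX"
  tower_organization:
    name: "Demo"                                 # hypothetical organisation name
    state: present
    tower_host: "http://localhost:8080"          # hypothetical AWX address
    tower_username: admin
    tower_password: "{{ awx_admin_password }}"   # hypothetical variable

- name: "Create a user in AWX"
  tower_user:
    username: ops
    password: "{{ default_password }}"           # hypothetical variable
    email: "ops@example.org"
    state: present
    tower_host: "http://localhost:8080"          # hypothetical AWX address
    tower_username: admin
    tower_password: "{{ awx_admin_password }}"   # hypothetical variable
```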

As in all my playbooks, I use group_vars to define the stuff I don’t want to keep repeating. In this demo, I’ve set all the passwords to “Passw0rd”, and I’ve created 3 users in both AWX and Gitlab – csa, ops and release – representing the sorts of people the demo was aimed at: Architects, Operations and Release Managers.
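A group_vars file along these lines would cover that. The file name and the variable names are hypothetical (the real ones are in the repo); only the password and the three user names come from the demo itself:

```yaml
# group_vars/all.yml (hypothetical file name and variable names)
default_password: "Passw0rd"
demo_users:
  - username: csa
    description: "Architects"
  - username: ops
    description: "Operations"
  - username: release
    description: "Release Managers"
```

Tasks can then loop over demo_users instead of repeating the same three names for both Gitlab and AWX.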

Maybe, one day, I’ll even be able to release the presentation that went with the demo ;)

On a more productive note, if you’re doing things with the tower_ modules and want to tell me what I need to fix up, or if you’re doing awesome things with the gitlab-rails tool, please visit the repo with this automation code in, and take a look at some of my “todo” items! Thanks!!

Featured image is “Tower” by “Yijun Chen” on Flickr and is released under a CC-BY-SA license.


Improving the speed of Azure deployments in Ansible with Async

2019-05-15 JonTheNiceGuy

Recently I was building a few environments in Azure using Ansible, and found this stanza which helped me to speed things up.

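  # Start one route table (UDR) creation per entry, without waiting for it to finish.
  - name: "Schedule UDR Creation"
    azure_rm_routetable:
      resource_group: "{{ resource_group }}"
      name: "{{ item.key }}_udr"
    loop: "{{ routetables | dict2items }}"
    loop_control:
      label: "{{ item.key }}_udr"
    async: 1000          # let each background job run for up to 1000 seconds
    poll: 0              # don't wait; move straight on to the next item
    changed_when: False
    register: sleeper

  # Check each background job until it reports that it has finished.
  - name: "Check UDRs Created"
    async_status:
      jid: "{{ item.ansible_job_id }}"
    register: sleeper_status
    until: sleeper_status.finished
    retries: 500         # up to 500 checks per job...
    delay: 4             # ...with 4 seconds between checks
    loop: "{{ sleeper.results|flatten(levels=1) }}"
    when: item.ansible_job_id is defined
    loop_control:
      label: "{{ item._ansible_item_label }}"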

What we do here is start an action with an “async” time (the maximum number of seconds the background job is allowed to run) and a “poll” time of 0 (so Ansible moves on to the next item without waiting for the job to finish). We then tell it that it’s “never changed” (changed_when: False) because otherwise it always shows as changed, and register the result of the scheduled item as “sleeper”.

After all the async jobs get queued, we then check the status of all the scheduled items with the async_status module, passing it the registered job ID. This lets me spin up a lot more items in parallel, and then “just” confirm afterwards that they’ve been run properly.
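The first task loops over routetables using the dict2items filter, which turns a dictionary into a list of items that each have a key and a value. The variable itself isn’t shown in this post, so this is only a guess at its shape, with made-up keys:

```yaml
# Hypothetical variable; the real keys and values aren't shown in this post.
routetables:
  public: {}     # item.key is "public", so the route table is "public_udr"
  private: {}    # item.key is "private", so the route table is "private_udr"
```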

It’s not perfect, and it can make for rather messy code. But, it does work, and it’s well worth giving it the once over, particularly if you’ve got some slow-to-run tasks in your playbook!

Featured image is “funfair action” by “Jon Bunting” on Flickr and is released under a CC-BY license.

