"Observatories Combine to Crack Open the Crab Nebula" by "NASA Goddard Space Flight Center" on Flickr

Nebula Offline Certificate Management with a Raspberry Pi using Bash

I have been playing again, recently, with Nebula, an Open Source Peer-to-Peer VPN product which boasts speed, simplicity and in-built firewalling. Although I only have a few nodes to play with (my VPS, my NAS, my home server and my laptop), I still wanted to simplify, for me, the process of onboarding devices. So, naturally, I spent a few evenings writing a bash script that helps me to automate the creation of my Nebula nodes.

Nebula Certificates

Nebula have implemented their own certificate structure. It’s similar to an x509 “TLS Certificate” (like you’d use to access an HTTPS website, or to establish an OpenVPN connection), but has a few custom fields.

The result of typing “nebula-cert print -path ca.crt” to print the custom fields

In this context, I’ve created a nebula Certificate Authority (CA), using this command:

nebula-cert ca -name nebula.example.org -ips 192.0.2.0/24,198.51.100.0/24,203.0.113.0/24 -groups Mobile,Workstation,Server,Lighthouse,db

So, what does this do?

Well, it creates the certificate and private key files, storing the name for the CA as “nebula.example.org” (there’s a reason for this!) and limiting the subnets and groups (like AWS or Azure Tags) the CA can issue certificates with.

Here, I’ve limited the CA to only issue IP addresses in the RFC5737 “Documentation” ranges, which are 192.0.2.0/24, 198.51.100.0/24 and 203.0.113.0/24, but this can easily be expanded to 10.0.0.0/8 or lots of individual subnets (I tested, and proved 1026 separate subnets which worked fine).

Groups, in Nebula parlance, are building blocks of the Security product, and can act like source or destination filters. In this case, I limited the CA to only being allowed to issue certificates with the groups of “Mobile”, “Workstation”, “Server”, “Lighthouse” and “db”.

As this certificate authority requires no internet access, and only enough access to read and write files, I have created my Nebula CA server on a separate Micro SD card to use with a Raspberry Pi device, and this is used only to generate a new CA certificate each 6 months (in theory, I’ve not done this part yet!), and to sign keys for all the client devices as they come on board.

I copy the ca.crt file to my target machines, and then move on to creating my client certificates

Client Certificates

When you generate key materials for Public Key Cryptographic activities (like this one), you’re supposed to generate the private key on the source device, and the private key should never leave the device on which it’s generated. Nebula allows you to do this, using the nebula-cert command again. That command looks like this:

nebula-cert keygen -out-key host.key -out-pub host.pub

If you notice, there’s a key difference at this point between Nebula’s key signing routine, and an x509 TLS style certificate, you see, this stage would be called a “Certificate Signing Request” or CSR in TLS parlance, and it usually would specify the record details for the certificate (normally things like “region”, “organisational unit”, “subject name” and so on) before sending it to the CA for signing (marking it as trusted).

In the Nebula world, you create a key, and send the public part of that (in this case, “host.pub” but it can have any name you like) to the CA, at which point the CA defines what IP addresses it will have, what groups it is in, and so on, so let’s do that.

nebula-cert sign -ca-crt ca.crt -ca-key ca.key -in-pub host.pub -out-crt host.crt -groups Workstation -ip 192.0.2.5/24 -name host.nebula.example.org

Let’s pick apart these options, shall we? The first four flags “-ca-crt“, “-ca-key“, “-in-pub” and “-out-crt” all refer to the CSR process – it’s reading the CA certificate and key, as well as the public part of the keypair created for the process, and then defines what the output certificate will be called. The next switch, -groups, identifies the tags we’re assigning to this node, then (the mandatory flag) -ip sets the IP address allocated to the node. Note that the certificate is using one of the valid group names, and has been allocated a valid IP address address in the ranges defined above. If you provide a value for the certificate which isn’t valid, you’ll get a warning message.

nebula-cert issues a warning when signing a certificate that tries to specify a value outside the constraints of the CA

In the above screenshot, I’ve bypassed the key generation and asked for the CA to sign with values which don’t match the constraints.

The last part is the name of the certificate. This is relevant because Nebula has a DNS service which can resolve the Nebula IPs to the hostnames assigned on the Certificates.

Anyway… Now that we know how to generate certificates the “hard” way, let’s make life a bit easier for you. I wrote a little script – Nebula Cert Maker, also known as certmaker.sh.

certmaker.sh

So, what does certmaker.sh do that is special?

  1. It auto-assigns an IP address, based on the MD5SUM of the FQDN of the node. It uses (by default) the first CIDR mask (the IP range, written as something like 192.0.2.0/24) specified in the CA certificate. If multiple CIDR masks are specified in the certificate, there’s a flag you can use to select which one to use. You can override this to get a specific increment from the network address.
  2. It takes the provided name (perhaps webserver) and adds, as a suffix, the name of the CA Certificate (like nebula.example.org) to the short name, to make the FQDN. This means that you don’t need to run a DNS service for support staff to access machines (perhaps you’ll have webserver1.nebula.example.org and webserver2.nebula.example.org as well as database.nebula.example.org).
  3. Three “standard” roles have been defined for groups, these are “Server”, “Workstation” and “Lighthouse” [1] (the latter because you can configure Lighthouses to be the DNS servers mentioned in step 2.) Additional groups can also be specified on the command line.

[1] A lighthouse, in Nebula terms, is a publically accessible node, either with a static IP, or a DNS name which resolves to a known host, that can help other nodes find each other. Because all the nodes connect to it (or a couple of “it”s) this is a prime place to run the DNS server, as, well, it knows where all the nodes are!

So, given these three benefits, let’s see these in a script. This script is (at least currently) at the end of the README file in that repo.

# Create the CA
mkdir -p /tmp/nebula_ca
nebula-cert ca -out-crt /tmp/nebula_ca/ca.crt -out-key /tmp/nebula_ca/ca.key -ips 192.0.2.0/24,198.51.100.0/24 -name nebula.example.org

# First lighthouse, lighthouse1.nebula.example.org - 192.0.2.1, group "Lighthouse"
./certmaker.sh --cert_path /tmp/nebula_ca --name lighthouse1 --ip 1 --lighthouse

# Second lighthouse, lighthouse2.nebula.example.org - 192.0.2.2, group "Lighthouse"
./certmaker.sh -c /tmp/nebula_ca -n lighthouse2 -i 2 -l

# First webserver, webserver1.nebula.example.org - 192.0.2.168, groups "Server" and "web"
./certmaker.sh --cert_path /tmp/nebula_ca --name webserver1 --server --group web

# Second webserver, webserver2.nebula.example.org - 192.0.2.191, groups "Server" and "web"
./certmaker.sh -c /tmp/nebula_ca -n webserver2 -s -g web

# Database Server, db.nebula.example.org - 192.0.2.182, groups "Server" and "db"
./certmaker.sh --cert_path /tmp/nebula_ca --name db --server --group db

# First workstation, admin1.nebula.example.org - 198.51.100.205, group "Workstation"
./certmaker.sh --cert_path /tmp/nebula_ca --index 1 --name admin1 --workstation

# Second workstation, admin2.nebula.example.org - 198.51.100.77, group "Workstation"
./certmaker.sh -c /tmp/nebula_ca -d 1 -n admin2 -w

# First Mobile device - Create the private/public key pairing first
nebula-cert keygen -out-key mobile1.key -out-pub mobile1.pub
# Then sign it, mobile1.nebula.example.org - 198.51.100.217, group "mobile"
./certmaker.sh --cert_path /tmp/nebula_ca --index 1 --name mobile1 --group mobile --public mobile1.pub

# Second Mobile device - Create the private/public key pairing first
nebula-cert keygen -out-key mobile2.key -out-pub mobile2.pub
# Then sign it, mobile2.nebula.example.org - 198.51.100.22, group "mobile"
./certmaker.sh -c /tmp/nebula_ca -d 1 -n mobile2 -g mobile -p mobile2.pub

Technically, the mobile devices are simulating the local creation of the private key, and the sharing of the public part of that key. It also simulates what might happen in a more controlled environment – not where everything is run locally.

So, let’s pick out some spots where this content might be confusing. I’ve run each type of invocation twice, once with the short version of all the flags (e.g. -c instead of --cert_path, -n instead of --name) and so on, and one with the longer versions. Before each ./certmaker.sh command, I’ve added a comment, showing what the hostname would be, the IP address, and the Nebula Groups assigned to that node.

It is also possible to override the FQDN with your own FQDN, but this command option isn’t in here. Also, if the CA doesn’t provide a CIDR mask, one will be selected for you (10.44.88.0/24), or you can provide one with the -b/--subnet flag.

If the CA has multiple names (e.g. nebula.example.org and nebula.example.com), then the name for the host certificates will be host.nebula.example.org and also host.nebula.example.com.

Using Bash

So, if you’ve looked at, well, almost anything on my site, you’ll see that I like to use tools like Ansible and Terraform to deploy things, but for something which is going to be run on this machine, I’d like to keep things as simple as possible… and there’s not much in this script that needed more than what Bash offers us.

For those who don’t know, bash is the default shell for most modern Linux distributions and Docker containers. It can perform regular expression parsing (checking that strings, or specific collections of characters appear in a variable), mathematics, and perform extensive loop and checks on values.

I used a bash template found on a post at BetterDev.blog to give me a basic structure – usage, logging and parameter parsing. I needed two functions to parse and check whether IP addresses were valid, and what ranges of those IP addresses might be available. These were both found online. To get just enough of the MD5SUM to generate a random IPv4 address, I used a function to convert the hexedecimal number that the MDSUM produces, and then turned that into a decimal number, which I loop around the address space in the subnets. Lastly, I made extensive use of Bash Arrays in this. This was largely thanks to an article on OpenSource.com about bash arrays. It’s well worth a read!

So, take a look at the internals of the script, if you want to know some options on writing bash scripts that manipulate IP addresses and read the output of files!

If you’re looking for some simple tasks to start your portfolio of work, there are some “good first issue” tasks in the “issues” of the repo, and I’d be glad to help you work through them.

Wrap up

I hope you enjoy using this script, and I hope, if you’re planning on writing some bash scripts any time soon, that you take a look over the code and consider using some of the templates I reference.

Featured image is “Observatories Combine to Crack Open the Crab Nebula” by “NASA Goddard Space Flight Center” on Flickr and is released under a CC-BY license.

"Honey pots" by "Nicholas" on Flickr

Adding MITM (or “Trusted Certificate Authorities”) proxy certificates for Linux and Linux-like Environments

In some work environments, you may find that a “Man In The Middle” (also known as MITM) proxy may have been configured to inspect HTTPS traffic. If you work in a predominantly Windows based environment, you may have had some TLS certificates deployed to your computer when you logged in, or by group policy.

I’ve previously mentioned that if you’re using Firefox on your work machines where you’ve had these certificates pushed to your machine, then you’ll need to enable a configuration flag to make those work under Firefox (“security.enterprise_roots.enabled“), but this is talking about Linux (like Ubuntu, Fedora, CentOS, etc.) and Linux-like environments (like WSL, MSYS2)

Late edit 2021-05-06: Following a conversation with SiDoyle, I added some notes at the end of the post about using the System CA path with the Python Requests library. These notes were initially based on a post by Mohclips from several years ago!

Start with Windows

From your web browser of choice, visit any HTTPS web page that you know will be inspected by your proxy.

If you’re using Mozilla Firefox

In Firefox, click on this part of the address bar and click on the right arrow next to “Connection secure”:

Clicking on the Padlock and then clicking on the Right arrow will take you to the “Connection Security” screen.
Certification Root obscured, but this where we prove we have a MITM certificate.

Click on “More Information” to take you to the “Page info” screen

More obscured details, but click on “View Certificate”

In recent versions of Firefox, clicking on “View Certificate” takes you to a new page which looks like this:

Mammoth amounts of obscuring here! The chain runs from left to right, with the right-most blob being the Root Certificate

Click on the right-most tab of this screen, and navigate down to where it says “Miscellaneous”. Click on the link to download the “PEM (cert)”.

The details on the Certificate Authority (highly obscured!), but here is where we get our “Root” Certificate for this proxy.

Save this certificate somewhere sensible, we’ll need it in a bit!

Note that if you’ve got multiple proxies (perhaps for different network paths, or perhaps for a cloud proxy and an on-premises proxy) you might need to force yourself in into several situations to get these.

If you’re using Google Chrome / Microsoft Edge

In Chrome or Edge, click on the same area, and select “Certificate”:

This will take you to a screen listing the “Certification Path”. This is the chain of trust between the “Root” certificate for the proxy to the certificate they issue so I can visit my website:

This screen shows the chain of trust from the top of the chain (the “Root” certificate) to the bottom (the certificate they issued so I could visit this website)

Click on the topmost line of the list, and then click “View Certificate” to see the root certificate. Click on “Details”:

The (obscured) details for the root CA.

Click on “Copy to File” to open the “Certificate Export Wizard”:

In the Certificate Export Wizard, click “Next”
Select “Base-64 encoded X.509 (.CER)” and click “Next”
Click on the “Browse…” button to select a path.
Name the file something sensible, and put the file somewhere you’ll find it shortly. Click “Save”, then click “Next”.

Once you’ve saved this file, rename it to have the extension .pem. You may need to do this from a command line!

Copy the certificate into the environment and add it to the system keychain

Ubuntu or Debian based systems as an OS, or as a WSL environment

As root, copy the proxy’s root key into /usr/local/share/ca-certificates/<your_proxy_name>.crt (for example, /usr/local/share/ca-certificates/proxy.my.corp.crt) and then run update-ca-certificates to update the system-wide certificate store.

RHEL/CentOS as an OS, or as a WSL environment

As root, copy the proxy’s root key into /etc/pki/ca-trust/source/anchors/<your_proxy_name>.pem (for example, /etc/pki/ca-trust/source/anchors/proxy.my.corp.pem) and then run update-ca-trust to update the system-wide certificate store.

MSYS2 or the Ruby Installer

Open the path to your MSYS2 environment (e.g. C:\Ruby30-x64\msys64) using your file manager (Explorer) and run msys2.exe. Then paste the proxy’s root key into the etc/pki/ca-trust/source/anchors subdirectory, naming it <your_proxy_name>.pem. In the MSYS2 window, run update-ca-trust to update the environment-wide certificate store.

If you’ve obtained the Ruby Installer from https://rubyinstaller.org/ and installed it from there, assuming you accepted the default path of C:\Ruby<VERSION>-x64 (e.g. C:\Ruby30-x64) you need to perform the above step (running update-ca-trust) and then copy the file from C:\Ruby30-x64\mysys64\etc\pki\ca-trust\extracted\pem\tls-ca-bundle.pem to C:\Ruby30-x64\ssl\cert.pem

Using the keychain

Most of your Linux and Linux-Like environments will operate fine with this keychain, but for some reason, Python needs an environment variable to be passed to it for this. As I encounter more environments, I’ll update this post!

The path to the system keychain varies between releases, but under Debian based systems, it is: /etc/ssl/certs/ca-certificates.crt while under RedHat based systems, it is: /etc/pki/tls/certs/ca-bundle.crt.

Python “Requests” library

If you’re getting TLS errors in your Python applications, you need the REQUESTS_CA_BUNDLE environment variable set to the path for the system-wide keychain. You may want to add this line to your /etc/profile to include this path.

Sources:

Featured image is “Honey pots” by “Nicholas” on Flickr and is released under a CC-BY license.

"Raven" by "Jim Bahn" on Flickr

Sending SSH login notifications to Matrix via Huginn using Webhooks

On the Self Hosted Podcast’s Discord Server, someone posted a link to the following blog post, which I read and found really interesting…: https://blog.hay-kot.dev/ssh-notifications-in-home-assistant/

You see, the key part of that post wasn’t that they were posting to Home Assistant when they were logging in, but instead that they were triggering a webhook on login. And I can do stuff with Webhooks.

What’s a webhook?

A webhook is a callable URL, either with a “secret” embedded in the URL or some authentication header that lets you trigger an action of some sort. I first came across these with Github, but they’re pretty common now. Services will offer these as a way to get an action in one service to do something in another. A fairly common webhook for those getting started with these sorts of things is where creating a pull request (PR) on a Github repository will trigger a message on something like Slack to say the PR is there.

Well, that’s all well and good, but what does Matrix or Huginn have to do with things?

Matrix is a decentralized, end to end encrypted, eventually consistent database system, that just happens to be used extensively as a chat network. In particular, it’s used by Open Source projects, like KDE and Mozilla, and by Government bodies, like the whole French goverment (lead by DINSIC) the German Bundeswehr (Unified Armed Forces) department.

Matrix has a reference client, Element, that was previously called “Riot”, and in 2018 I produced a YouTube video showing how to bridge various alternative messaging systems into Matrix.

Huginn describes itself as:

Huginn is a system for building agents that perform automated tasks for you online. They can read the web, watch for events, and take actions on your behalf. Huginn’s Agents create and consume events, propagating them along a directed graph. Think of it as a hackable version of IFTTT or Zapier on your own server. You always know who has your data. You do.

Huginn Readme

With Huginn, I can create “agents”, including a “receive webhook agent” that will take the content I send, and tweak it to do something else. In the past I used IFTTT to do some fun things, like making this blog work, but now I use Huginn to post Tweets when I post to this blog.

So that I knew that Huginn was posting my twitter posts, I created a Matrix room called “Huginn Alerts” and used the Matrix account I created for the video I mentioned before to send me messages that it had made the posts I wanted. I followed the guidance from this page to do it: https://drwho.virtadpt.net/archive/2020-02-12/integrating-huginn-with-a-matrix-server/

Enough already. Just show me what you did.

In Element.io

  1. Get an access token for the Matrix account you want to post with.

Log into the web interface at https://app.element.io and go to your settings

Click where it says your handle, then click on where it says “All Settings”.

Then click on “Help & About” and scroll to the bottom of that page, where it says “Advanced”

Get to the “Advanced” part of the settings, under “Help & About” to get your access token.

Click where it says “Access Token: <click to reveal>” (strangely, I’m not posting that 😉)

  1. Click on the room, then click on it’s name at the top to open the settings, then click on “Advanced” to get the “Internal room ID”
Gettng the Room ID. Note, it starts with an exclamation mark (!) and ends :<servername>.

In Huginn

  1. Go to the “Credentials” tab, and click on “New Credential”. Give the credential a name (perhaps “Matrix Bot Access Token”), leave it as text and put your access token in here.
  1. Create a Credential for the Room ID. Like before, name it something sensible and put the ID you found earlier.
  1. Create a “Post Agent” by going to Agents and selecting “New agent”. This will show just the “Type” box. You can type in this box to put “Post Agent” and then find it. That will then provide you with the rest of these boxes. Provide a name, and tick the box marked “Propagate immediately”. I’ll cover the content of the “Options” box after this screenshot.

In the “Options” block is a button marked “Toggle View”. Select this which turns it from the above JSON pretty editor, into this text field (note your text is likely to be different):

My content of that box is as follows:

{
  "post_url": "https://matrix.org/_matrix/client/r0/rooms/{% credential Personal_Matrix_Notification_Channel %}/send/m.room.message?access_token={% credential Matrix_Bot_Access_Credential %}",
  "expected_receive_period_in_days": "365",
  "content_type": "json",
  "method": "post",
  "payload": {
    "msgtype": "m.text",
    "body": "{{ text }}"
  },
  "emit_events": "true",
  "no_merge": "false",
  "output_mode": "clean"
}

Note that the “post_url” value contains two “credential” values, like this:

{% credential Personal_Matrix_Notification_Channel %} (this is the Room ID we found earlier) and {% credential Matrix_Bot_Access_Credential %} (this is the Access Token we found earlier).

If you’ve used different names for these values (which are perfectly valid!) then just change these two. The part where it says “{{ text }}” leave there, because we’ll be using that in a later section. Click “Save” (the blue button at the bottom).

  1. Create a Webhook Agent. Go to Agents and then “New Agent”. Select “Webhook Agent” from the “Type” field. Give it a name, like “SSH Logged In Notification Agent”. Set “Keep Events” to a reasonable number of days, like 5. In “Receivers” find the Notification agent you created (“Send Matrix Message to Notification Room” was the name I used). Then, in the screenshot, I’ve pressed the “Toggle View” button on the “Options” section, as this is, to me a little clearer.

The content of the “options” box is:

{
  "secret": "supersecretstring",
  "expected_receive_period_in_days": 365,
  "payload_path": ".",
  "response": ""
}

Change the “secret” from “supersecretstring” to something a bit more useful and secure.

The “Expected Receive Period in Days” basically means, if you’ve not had an event cross this item in X number of days, does Huginn think this agent is broken? And the payload path of “.” basically means “pass everything to the next agent”.

Once you’ve completed this step, press “Save” which will take you back to your agents, and then go into the agent again. This will show you a page like this:

Copy that URL, because you’ll need it later…

On the server you are logging the SSH to

As root, create a file called /etc/ssh/sshrc. This file will be your script that will run every time someone logs in. It must have the file permissions 0644 (u+rw,g+r,o+r), which means that there is a slight risk that the Webhook secret is exposed.

The content of that file is as follows:

#!/bin/sh
ip="$(echo "$SSH_CONNECTION" | cut -d " " -f 1)"
curl --silent\
     --header "Content-Type: application/json"\
     --request POST\
     --data '{
       "At": "'"$(date -Is)"'",
       "Connection": "'"$SSH_CONNECTION"'",
       "User": "'"$USER"'",
       "Host": "'"$(hostname)"'",
       "Src": "'"$ip"'",
       "text": "'"$USER@$(hostname) logged in from $ip at $(date +%H:%M:%S)"'"
     }'\
     https://my.huginn.website/some/path/web_requests/taskid/secret

The heading line (#!/bin/sh) is more there for shellcheck, as, according to the SSH man page this is executed by /bin/sh either way.

The bulk of these values (At, Connection, User, Host or Src) are not actually used by Huginn, but might be useful for later… the key one is text, which if you recall from the “Send Matrix Message to Notification Room” Huginn agent, we put {{ text }} into the “options” block – that’s this block here!

So what happens when we log in over SSH?

SSH asks the shell in the user’s context to execute /etc/ssh/sshrc before it hands over to the user’s login session. This script calls curl and hands some POST data to the url.

Huginn receives this POST via the “SSH Logged In Notification Agent”, and files it.

Huginn then hands that off to the “Send Matrix Message to Notification Room”:

Huginn makes a POST to the Matrix.org server, and Matrix sends the finished message to all the attached clients.

Featured image is “Raven” by “Jim Bahn” on Flickr and is released under a CC-BY license.

"#security #lockpick" by "John Jones" on Flickr

Auto-starting an SSH Agent in Windows Subsystem for Linux

I tend to use Windows Subsystem for Linux (WSL) as a comprehensive SSH client, mostly for running things like Ansible scripts and Terraform. One of the issues I’ve had with it though is that, on a Linux GUI based system, I would start my SSH Agent on login, and then the first time I used an SSH key, I would unlock the key using the agent, and it would be cached for the duration of my logged in session.

While I was looking for something last night, I came across this solution on Stack Overflow (which in turn links to this blog post, which in turn links to this mailing list post) that suggests adding the following stanza to ~/.profile in WSL. I’m running the WSL version of Ubuntu 20.04, but the same principles apply on Cygwin, or, probably, any headless-server installation of a Linux distribution, if that’s your thing.

SSH_ENV="$HOME/.ssh/agent-environment"
function start_agent {
    echo "Initialising new SSH agent..."
    /usr/bin/ssh-agent | sed 's/^echo/#echo/' > "${SSH_ENV}"
    echo succeeded
    chmod 600 "${SSH_ENV}"
    . "${SSH_ENV}" > /dev/null
}
# Source SSH settings, if applicable
if [ -f "${SSH_ENV}" ]; then
    . "${SSH_ENV}" > /dev/null
    ps -ef | grep ${SSH_AGENT_PID} | grep ssh-agent$ > /dev/null || {
        start_agent;
    }
else
    start_agent;
fi

Now, this part is all well-and-good, but what about that part where you want to SSH using a key, and then that being unlocked for the duration of your SSH Agent being available?

To get around that, in the same solution page, there is a suggestion of adding this line to your .ssh/config: AddKeysToAgent yes. I’ve previously suggested using dynamically included SSH configuration files, so in this case, I’d look for your file which contains your “wildcard” stanza (if you have one), and add the line there. This is what mine looks like:

Host *
  AddKeysToAgent yes
  IdentityFile ~/.ssh/MyCurrentKey

How does this help you? Well, if you’re using jump hosts (using ProxyJump MyBastionHost, for example) you’ll only be prompted for your SSH Key once, or if you typically do a lot of SSH sessions, you’ll only need to unlock your session once.

BUT, and I can’t really stress this enough, don’t use this on a shared or suspected compromised system! If you’ve got a root account which can access the content of your Agent’s Socket and PID, then any protections that private key may have held for your system is compromised.

Featured image is “#security #lockpick” by “John Jones” on Flickr and is released under a CC-BY-ND license.

"Accept a New SSH Host Key" by "Linux Screenshots" on Flickr

Purposefully Reducing SSH Security when performing Builds of short-lived devices

I’ve recently been developing a few builds of things at home using throw-away sessions of virtual machines, and I found myself repeatedly having to accept and even having to remove SSH host keys for things I knew wouldn’t be around for long. It’s not a huge disaster, but it’s an annoyance.

This annoyance comes from the fact that SSH uses a thing called “Trust-On-First-Use” (Or TOFU) to protect yourself against a “Man-in-the-Middle” attack (or even where the host has been replaced with something malicious), which, for infrastructure that has a long lifetime (anything more than a couple of days) makes sense! You’re building something you want to trust hasn’t been compromised! That said, if you’re building new virtual machines, testing something and then rebuilding it to prove your script worked… well, that’s not so useful!

So, in this case, if you’ve got a designated build network, or if you trust, implicitly, your normal working network, this is a dead simple work-around.

In $HOME/.ssh/config or in $HOME/.ssh/config.d/local (if you’ve followed my previous advice to use separate ssh config files), add the following stanza:

# RFC1918
Host 10.* 172.16.* 172.17.* 172.18.* 172.19.* 172.20.* 172.21.* 172.22.* 172.23.* 172.24.* 172.25.* 172.26.* 172.27.* 172.28.* 172.29.* 172.30.* 172.31.* 192.168.*
        StrictHostKeyChecking no
        UserKnownHostsFile /dev/null

# RFC5373 and RFC2544
Host 192.0.2.* 198.51.100.* 203.0.113.* 198.18.* 198.19.*
        StrictHostKeyChecking no
        UserKnownHostsFile /dev/null

These stanzas let you disable host key checking for any IP address in the RFC1918 ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16), and for the RFC5373 ranges (192.0.2.0/24, 198.51.100.0/24 and 203.0.113.0/24) – which should be used for documentation, and for the RFC2544 range (198.18.0.0/15) which should be used for inter-network testing.

Alternatively, if you always use a DDNS provider for short-lived assignments (for example, I use davd/docker-ddns) then instead, you can use this stanza:

Host *.ddns.example.com
        StrictHostKeyChecking no
        UserKnownHostsFile /dev/null

(Assuming, of course, you use ddns.example.com as your DDNS address!)

Featured image is “Accept a New SSH Host Key” by “Linux Screenshots” on Flickr and is released under a CC-BY license.

"Mesh Facade" by "Pedro Ângelo" on Flickr

Looking at the Nebula Overlay Meshed VPN Network from Slack

Around 2-3 years ago, Slack– the company who produces Slack the IM client, started working on a meshed overlay network product, called Nebula, to manage their environment. After two years of running their production network on the back of it, they decided to open source it. I found out about Nebula via a Medium Post that was mentioned in the HangOps Slack Group. I got interested in it, asked a few questions about Nebula in the Slack, and then in the Github Issues for it, and then recently raised a Pull Request to add more complete documentation than their single (heavily) commented config file.

So, let’s go into some details on why this is interesting to me.

1. Nebula uses a flat IPv4 network to identify all hosts in the network, no matter where in the network those hosts reside.

This means that I can address any host in my (self allocated) 198.18.0.0/16 network, and I don’t need to worry about routing tables, production/DR sites, network tromboneing and so on… it’s just… Flat.

[Note: Yes, that IP address “looks” like a public subnet – but it’s actually a testing network, allocated by IANA for network testing!]

2. Nebula has host-based firewalling built into the configuration file.

This means that once I know how my network should operate (yes, I know that’s a big ask), I can restrict my servers from being able to reach my laptops, or I can stop my web server from being able to talk to my database server, except for on the database ports. Lateral movement becomes a LOT harder.

This firewalling also looks a lot like (Network) Security Groups (for the AWS and Azure familiar), so you have a default “Deny” rule, and then layer “Allow” rules on top. You also have Inbound and Outbound rules, so if you want to stop your laptops from talking anything but DNS, SSH, HTTPS and ICMP over Nebula…. well, yep, you can do that :)

3. Nebula uses a PKI environment. Where you can have multiple Certificate Authorities.

This means that I have a central server running my certificate authority (CA), with a “backup” CA, stored offline – in case of dire disaster with my primary CA, and push both CA’s to all my nodes. If I suddenly need to replace all the certificates that my current CA signed, I can do that with minimal interruption to my nodes. Good stuff.

Nebula also created their own PKI attributes to identify the roles of each device in the Nebula environment. By signing that as part of the certificate on each node too, means your CA asserts that the role that certificate holds is valid for that node in the network.

Creating a node’s certificate is a simple command:

nebula-cert sign -name jon-laptop -ip 198.18.0.1/16 -groups admin,laptop,support

This certificate has the IP address of the node baked in (it’s 198.18.0.1) and the groups it’s part of (admin, laptop and support), as well as the host name (jon-laptop). I can use any of these three items in my firewall rules I mentioned above.

4. It creates a peer-to-peer, meshed VPN.

While it’s possible to create a peer-to-peer meshed VPN with commercial products, I’ve not seen any which are as light-weight to deploy as this. Each node finds all the other nodes in the network by using a collection of “Lighthouses” (similar to Torrent Seeds or Skype Super Nodes) which tells all the connecting nodes where all the other machines in the network are located. These then initiate UDP connections to the other nodes they want to talk to. If they are struggling (because of NAT or Double NAT), then there’s a NAT Punching process (called, humourously, “punchy”) which lets you signal via the Lighthouse that you’re trying to reach another node that can’t see your connection, and asks it to also connect out to you over UDP… thereby fixing the connection issue. All good.

5. Nebula has clients for Windows, Mac and Linux. Apparently there are clients for iOS in the works (meh, I’m not on Apple… but I know some are) and I’ve heard nothing about Android as yet, but as it’s on Linux, I’m sure some enterprising soul can take a look at it (the client is written in Go).

If you want to have a look at Nebula for your own testing, I’ve created a Terraform based environment on AWS and Azure to show how you’d manage it all using Ansible Tower, which builds:

2 VPCs (AWS) and 1 VNet (Azure)
6 subnets (3 public, 3 private)
1 public AWX (the upstream project from Ansible Tower) Server
1 private Nebula Certificate Authority
2 public Web Servers (one in AWS, one in Azure)
2 private Database Servers (one in AWS, one in Azure)
2 public Bastion Servers (one in AWS, one in Azure) – that lets AWX reach into the Private sections of the network, without exposing SSH from all the hosts.

If you don’t want to provision the Azure side, just remove load_web2_module.tf from the Terraform directory in that repo… Job’s a good’n!

I have plans to look at a couple of variables, like Nebula’s closest rival, ZeroTier, and to look at using SaltStack instead of Ansible, to reduce the need for the extra Bastion servers.

Featured image is “Mesh Facade” by “Pedro Ângelo” on Flickr and is released under a CC-BY-SA license.

“Swatch Water Store, Grand Central Station, NYC, 9/2016, pics by Mike Mozart of TheToyChannel and JeepersMedia on YouTube #Swatch #Watch” by “Mike Mozart” on Flickr

Time Based Security

I came across the concept of “Time Based Security” (TBS) in the Sysadministrivia podcast, S4E13.

I’m still digging into the details of it, but in essence, the “Armadillo” (Crunchy on the outside, soft on the inside) protection model is broken (sometimes known as the “Fortress Model”). You assume that your impenetrable network boundary will prevent attackers from getting to your sensitive data. While this may stop them for a while, what you’re actually seeing here is one part of a complex protection system, however many organisations miss the fact that this is just one part.

The examples used in the only online content I’ve found about this refer to a burglary.

In this context, your “Protection” (P) is measured in time. Perhaps you have hardened glass that takes 20 seconds to break.

Next, we evaluate “Detection” (D) which is also, surprisingly enough, measured in time. As the glass is hit, it triggers an alarm to a security facility. That takes 20 seconds to respond and goes to a dispatch centre, another 20 seconds for that to be answered and a police officer dispatched.

The police officer being dispatched is the “Response” (R). The police take (optimistically) 2 minutes to arrive (it was written in the 90’s so the police forces weren’t decimated then).

So, in the TBS system, we say that Detection (D) of 40 seconds plus Response (R) of 120 seconds = 160 seconds. This is greater than Protection (P) of 20 seconds, so we have an Exposure (E) time of 140 seconds E = P – (D + R). The question that is posed is, how much damage can be done in E?

So, compare this to your average pre-automation SOC. Your firewall, SIEM (Security Incident Event Management system), IDS (Intrusion Detection System) or WAF (Web Application Firewall) triggers an alarm. Someone is trying to do something (e.g. Denial Of Service attack, password spraying or port scanning for vulnerable services) a system you’re responsible for. While D might be in the tiny fractions of a minute (perhaps let’s say 1 minute, for maths sake), R is likely to be minutes or even hours, depending on the refresh rate of the ticket management system or alarm system (again, for maths sake, let’s say 60 minutes). So, D+R is now 61 minutes. How long is P really going to hold? Could it be less than 30 minutes against a determined attacker? (Let’s assume P is 30 minutes for maths sake).

Let’s do the calculation for a pre-automation SOC (Security Operations Centre). P-(D+R)=E. E here is 31 minutes. How much damage can an attacker do in 31 minutes? Could they put a backdoor into your system? Can they download sensitive data to a remote system? Could they pivot to your monitoring system, and remove the logs that said they were in there?

If you consider how much smaller the D and R numbers become with an event driven SOAR (Security Orchestration and Automation Response) system – does that improve your P and E numbers? Consider that if you can get E to 0, this could be considered to be “A Secure Environment”.

Also, consider the fact that many of the tools we implement for security reduce D and R, but if you’re not monitoring the outputs of the Detection components, then your response time grows significantly. If your Detection component is misconfigured in that it’s producing too many False Positives (for example, “The Boy Who Cried Wolf“), so you don’t see the real incident, then your Response might only be when a security service notifies you that your data, your service or your money has been exposed and lost. And that wouldn’t be good now… Time to look into automation 😁

Featured image is “Swatch Water Store, Grand Central Station, NYC, 9/2016, pics by Mike Mozart of TheToyChannel and JeepersMedia on YouTube #Swatch #Watch” by “Mike Mozart” on Flickr and is released under a CC-BY license.

"Hacker" by "The Preiser Project" on Flickr

What to do when a website you’ve trusted gets hacked?

Hello! Maybe you just got a sneeking suspicion that a website you trusted isn’t behaving right, perhaps someone told you that “unusual content” is being posted in your name somewhere, or, if you’re really lucky, you might have just had an email from a website like “HaveIBeenPwned.com” or “Firefox Monitor“. It might look something like this:

An example of an email from the service “Have I Been Pwned”

Of course, it doesn’t feel like you’re lucky! It’s OK. These things happen quite a lot of the time, and you’re not the only one in this boat!

How bad is it, Doc?

First of all, don’t panic! Get some idea of the scale of problem this is by looking at a few key things.

  1. How recent was the breach? Give this a score between 1 (right now) and 10 (more than 1 month ago).
  2. How many websites and services do you use this account on? Give this a score between 1 (right now) and 10 (OMG, this is *my* password, and I use it everywhere).
  3. How many other services would use this account to authenticate to, or get a password reset from? Give this a score between 1 (nope, it’s just this website. We’re good) and 10 (It’s my email account, and everything I’ve ever signed up to uses this account as the login address… or it’s Facebook/Google and I use their authentication to login to everything else).
  4. How much does your reputation hang on this website or any other websites that someone reusing the credentials of this account would get access to? Give this a score between 1 (meh, I post cat pictures from an anonymous username) and 10 (I’m an INFLUENCER HERE dagnamit! I get money because I said stuff here and/or my job is on that website, or I am publicly connected to my employer by virtue of that profile).
  5. (Optional) If this is from a breach notification, does it say that it’s just email addresses (score 1), or that it includes passwords (score 5), unencrypted or plaintext passwords (score 8) or full credit card details (score 10)?

Once you’ve got an idea of scale (4 to 40 or 5 to 50, depending on whether you used that last question), you’ve got an idea of how potentially bad it is.

Take action!

Make a list of the websites you think that you need to change this password on.

  1. Start with email accounts (GMail, Hotmail, Outlook, Yahoo, AOL and so on) – each email account that uses the same password needs to be changed, and this is because almost every website uses your email address to make a “password” change on it! (e.g. “Forgot your password, just type in your email address here, and we’ll send you a reset link“).
  2. Prominent social media profiles (e.g. Facebook, Twitter, Instagram) come next, even if they’re not linked to your persona. This is where your potential reputation damage comes from!
  3. Next up is *this* website, the one you got the breach notification for. After all, you know this password is “wild” now!

Change some passwords

This is a bit of a bind, but I’d REALLY recommend making a fresh password for each of those sites. There are several options for doing this, but my preferred option is to use a password manager. If you’re not very tech savvy, consider using the service Lastpass. If you’re tech savvy, and understand how to keep files in sync across multiple devices, you might be interested in using KeePassXC (my personal preference) or BitWarden instead.

No really. A fresh password. Per site. Honest. And not just “MyComplexPassw0rd-Hotmail” because there are ways of spotting you’ve done something like that, and when they come to your facebook account, they’ll try “MyComplexPassw0rd-Facebook” just to see if it gets them in.

ℹ️ Using a password manager gives you a unique, per-account password. I just generated a fresh one (for a dummy website), and it was 2-K$F+j7#Zz8b$A^]qj. But, fortunately, I don’t have to remember it. I can just copy and paste it in to the form when I need to change it, or perhaps, if you have a browser add-on, that’ll fill it in for you.

Making a list, and checking it twice!

Fab, so you’ve now got a lovely list of unique passwords. A bit like Santa, it’s time to check your list again. Assume that your list of sites you just changed passwords for were all compromised, because someone knew that password… I know, it’s a scary thought! So, have a look at all those websites you just changed the password on and figure out what they have links to, then you’ll probably make your list of things you need to change a bit bigger.

Not sure what they have links to?

Well, perhaps you’re looking at an email account… have a look through the emails you’ve received in the last month, three months or year and see how many of those come from “something” unique. Perhaps you signed up to a shopping site with that email address? It’s probably worth getting a password reset for that site done.

Perhaps you’re looking at a social media site that lets you login to other services? Check through those other services and make sure that “someone” hasn’t allowed access to a website they control. After all, you did lose access to that website, and so you don’t know what it’s connected to.

Also, check all of these sites, and make sure there aren’t any unexpected “active sessions” (where someone else is logged into your account still). If you have got any, kick them out :)

OK, so the horse bolted, now close the gate!

Once you’ve sorted out all of these passwords, it’s probably worth looking at improving your security in general. To do this, we need to think about how people get access to your account. As I wrote in my “What to do when your Facebook account gets hacked?” post:

What if you accidentally gave your password to someone? Or if you went to a website that wasn’t actually the right page and put your password in there by mistake? Falling prey to this when it’s done on purpose is known as social engineering or phishing, and means that someone else has your password to get into your account.

A quote from the “Second Factor” part of “What to do when your Facebook account gets hacked?

The easiest way of locking this down is to use a “Second Factor” (sometimes abbreviated to 2FA). You need to give your password (“something you know”) to log into the website. Now you also need something separate, that isn’t in the same store. If this were a physical token (like a SoloKey, Yubikey, or a RSA SecurID token), it’d be “something you have” (after all, you need to carry around that “token” with you), but normally these days it’s something on your phone.

Some places will send you a text message, others will pop up an “approve login” screen (and, I should note, if you get one and YOU AREN’T LOGGING IN, don’t press “approve”!), or you might have a separate app (perhaps called “Google Authenticator”, “Authy” or something like “Duo Security”) that has a number that keeps changing.

You should then finish your login with a code from that app, SMS or token or reacting to that screen or perhaps even pressing a button on a thing you plug into your computer. If you want to know how to set this up, take a look at “TwoFactorAuth.org“, a website providing access to the documentation on setting up 2FA on many of the websites you currently use… but especially do this with your email accounts.

Featured image is “Hacker” by “The Preiser Project” on Flickr and is released under a CC-BY license.

JonTheNiceGuy and "The Chief" Peter Bleksley at BSides Liverpool 2019

Review of BSIDES Liverpool 2019

I had the privilege today to attend BSIDES Liverpool 2019. BSIDES is a infosec community conference. The majority of the talks were recorded, and I can strongly recommend making your way through the content when it becomes available.

Full disclosure: While my employer is a sponsor, I was not there to represent the company, I was just enjoying the show. A former colleague (good friend and, while he was still employed by Fujitsu, an FDE – so I think he still is one) is one of the organisers team.

The first talk I saw (aside from the welcome speech) was the keynote by Omri Segev Moyal (@gelossnake) about how to use serverless technologies (like AWS Lambda) to build a malware research platform. The key takeaway I have from that talk was how easy it is to build a simple python lambda script using Chalice. That was fantastic, and I’m looking forward to trying some things with that service!

For various reasons (mostly because I got talking to people), I missed the rest of the morning tracks except for the last talk before lunch. I heard great things about the Career Advice talk by Martin King, and the Social Engineering talk by Tom H, but will need to catch up on those on the videos released after.

Just before lunch we received a talk from “The Chief” (from the Channel 4 TV Series “Hunted”), Peter Bleksley, about an investigation he’s currently involved in. This was quite an intense session, and his history (the first 1/4 of his talk) was very interesting. Just before he went in for his talk, I got a selfie with him (which is the “Featured Image” for this post :) )

After lunch, I sat on the Rookies Track, and saw three fantastic talks, from Chrissi Robertson (@frootware) on Imposter Syndrome, Matt (@reversetor) on “Privacy in the age of Convenience” (reminding me of one of my very early talks at OggCamp/BarCamp Manchester) and Jan (@janfajfer) about detecting data leaks on mobile devices with EVPN. All three speakers were fab and nailed their content.

Next up was an unrecorded talk by Jamie (@2sec4u) about WannaCry, as he was part of the company who discovered the “Kill-Switch” domain. He gave a very detailed overview of the timeline about WannaCry, the current situation of the kill-switch, and a view on some of the data from infected-but-dormant machines which are still trying to reach the kill-switch. A very scary but well explained talk. Also, memes and rude words, but it’s clearly a subject that needed some levity, being part of a frankly rubbish set of circumstances.

After that was a talk from (two-out-of-six of) The Beer Farmers. This was a talk (mostly) about privacy and the lack of it from the social media systems of Facebook, Twitter and Google. As I listen to The Many Hats Club podcast, on which the Beer Farmers occasionally appear, it was a great experience matching faces to voices.

We finished the day on a talk by Finux (@f1nux) about Machiavelli as his writings (in the form of “The Prince”) would apply to Infosec. I was tempted to take a whole slew of photos of the slide deck, but figured I’d just wait for the video to be released, as it would, I’m sure, make more sense in context.

There was a closing talk, and then everyone retired to the bar. All in all, a great day, and I’m really glad I got the opportunity to go (thanks for your ticket Paul (@s7v7ns) – you missed out mate!)

"Matrix" by "Paul Downey" on Flickr

Idempotent Dynamic Content in Ansible

One of my colleagues recently sent out a link to a post about generating idempotent random numbers for Ansible. As I was reading it, I realised that there are other ways of doing the same thing (but not quite as pretty).

See, one of the things I (mis-)use Ansible for is to build Azure, AWS and OpenStack environments (instead of, perhaps, using Terraform, Cloud Formations or Heat Stacks). As a result, I frequently want to set complex passwords that are unique to *that environment* but that aren’t new for each build. My way of doing this is to run a delegated task to generate files in host_vars. Here’s a version of the playbook I use for that!

In the same gist as that block has been sourced from I have some example output from “20 hosts” – one of which has a pre-defined password in the inventory, and the rest of which are generated by the script.

I hope this is useful to someone!

Late Edit – 2019-05-19: Encrypting the values you generate

Following this post, a friend of mine – Jeremy mentioned on Linked In that I should have a look at Ansible Vault. Well, *ideally*, yes, however, when I looked at this code, I couldn’t work out a way of forcing the session to run Vault against a value I’ve just created, short of running something a raw or a shell module like “ansible-vault encrypt {{ file_containing_password }}“. Realistically, if you’re doing a lot with these passwords, you should probably use an external password vault, such as HashiCorp’s Vault or PasswordStore.org’s Pass. Neither of which I tend to use, because it’s just not part of my life yet – but I’ve heard good things about both!

Featured image is “Matrix” by “Paul Downey” on Flickr and is released under a CC-BY license.