Getting started with Unit Testing for PHP

Unit testing seems like a bit of a dark art when you’re first introduced to it. “Create this new file. Tell it what the result is supposed to be when you run a test, and it’ll tell you if you’re right or not.”

Let’s start with a pseudocode example:

test->assertTrue(1+1 == 2);  // The expression is true, so the test passes, huzzah!
test->assertFalse(1+1 == 3); // The expression is false, so this passes too. Those integers must not have been large enough

I want to use PHPUnit, and for me the easiest way to get this and the rest of the tools I’ll be referring to in this collection of posts is to install “The PHP Quality Assurance Toolchain”. On my Ubuntu install, this was done as follows:

sudo pear upgrade PEAR
sudo pear config-set auto_discover 1
sudo pear install --all-deps pear.phpqatools.org/phpqatools

Now we’ve got the tools in place, let’s set up the directory structure.

/
+ -- Classes
|    + -- Config.php
+ -- Tests
     + -- ConfigTest.php

Here we’ve created two files: one contains the class we want to use, and the other contains the tests we will be running.

So, let’s add the bare minimum these two files need to be valid and testable.

/Classes/Config.php

<?php
class Config
{
}

/Tests/ConfigTest.php

<?php

include dirname(__FILE__) . '/../Classes/Config.php';

class ConfigTest extends PHPUnit_Framework_TestCase
{
}

So, just to summarise, here we have two essentially empty classes.

Let’s put some code into the test file.

<?php

include dirname(__FILE__) . '/../Classes/Config.php';

class ConfigTest extends PHPUnit_Framework_TestCase
{
  public function testCreateObject()
  {
    $config = new Config();
    $this->assertTrue(is_object($config));
  }
}

We can now run this test from the command line as follows:

phpunit Tests/ConfigTest.php
PHPUnit 3.6.10 by Sebastian Bergmann.

.

Time: 1 second, Memory: 3.00Mb

OK (1 test, 1 assertion)

That was nice and straightforward!

Let’s add some more code!

In ConfigTest, let’s tell it to load some configuration, using a config file.

<?php

include dirname(__FILE__) . '/../Classes/Config.php';

class ConfigTest extends PHPUnit_Framework_TestCase
{
  public function testCreateObject()
  {
    $config = new Config();
    $this->assertTrue(is_object($config));
  }

  public function testLoadConfig()
  {
    $config = new Config();
    $config->load();
  }
}

And now when we run it?

PHP Fatal error:  Call to undefined method Config::load() in /var/www/PhpBetterPractices/Tests/ConfigTest.php on line 16

Ah, perhaps we need to write some code into /Classes/Config.php

<?php
class Config
{
  public function load()
  {
    include dirname(__FILE__) . '/../Config/default_config.php';
  }
}

But, running this, again, we get an error message!

PHPUnit 3.6.10 by Sebastian Bergmann.

.E

Time: 0 seconds, Memory: 3.00Mb

There was 1 error:

1) ConfigTest::testLoadConfig
include(/var/www/PhpBetterPractices/Config/default_config.php): failed to open stream: No such file or directory

/var/www/PhpBetterPractices/Classes/Config.php:7
/var/www/PhpBetterPractices/Classes/Config.php:7
/var/www/PhpBetterPractices/Tests/ConfigTest.php:16

FAILURES!
Tests: 2, Assertions: 1, Errors: 1.

So, we actually need to check that the file exists first; perhaps we should throw an exception if it doesn’t? We could also pass the name of the config file to the method, which would let us test more and different configuration options, should we need them.

class Config
{
    public function load($file = null)
    {
        if ($file == null) {
            $file = 'default.config.php';
        }

        $filename = dirname(__FILE__) . '/../Config/' . $file;

        if (file_exists($filename)) {
            include $filename;
        } else {
            throw new InvalidArgumentException("File not found");
        }
    }
}

So, here’s the new UnitTest code:

class ConfigTest extends PHPUnit_Framework_TestCase
{
    public function testCreateObject()
    {
        $config = new Config();
        $this->assertTrue(is_object($config));
    }

    public function testLoadConfig()
    {
        $config = new Config();
        $config->load();
    }

    /**
     * @expectedException InvalidArgumentException
     */
    public function testFailLoadingConfig()
    {
        $config = new Config();
        @$config->load('A file which does not exist');
    }
}

This assumes the file /Config/default.config.php exists, albeit as an empty file.

So, let’s run those tests and see what happens?

PHPUnit 3.6.10 by Sebastian Bergmann.

...

Time: 0 seconds, Memory: 3.25Mb

OK (3 tests, 2 assertions)

Huzzah! That’s looking good. Notice that to test something which should throw an exception, you have two options. You can wrap the call in a try/catch block: in the try part, follow the call with $this->assertTrue(false) to prevent false positives, and in the catch part, do your $this->assertBlah() on the exception. Alternatively (and much more simply), use the documentation annotation @expectedException NameOfException on the test method. This is how I did it with the test “testFailLoadingConfig()”. I also prefixed the call under test with the @ symbol, but that only suppresses PHP warnings and notices; it is the annotation that tells PHPUnit to expect the exception.
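The try/catch approach can be sketched roughly as follows. This version is written without PHPUnit so that it runs on its own: the cut-down Config class and the helper name loadThrowsWhenFileMissing() are assumptions made for illustration. Inside a real PHPUnit test, the "return false" line is where $this->assertTrue(false) would go.

```php
<?php
// Cut-down copy of the Config class above, repeated so the sketch is standalone.
class Config
{
    public function load($file = null)
    {
        if ($file == null) {
            $file = 'default.config.php';
        }
        $filename = dirname(__FILE__) . '/../Config/' . $file;
        if (file_exists($filename)) {
            include $filename;
        } else {
            throw new InvalidArgumentException("File not found");
        }
    }
}

// Hypothetical helper: true only if load() throws the expected exception.
function loadThrowsWhenFileMissing()
{
    $config = new Config();
    try {
        $config->load('A file which does not exist');
        // Reaching this line means nothing was thrown. In a PHPUnit test
        // this is where the forced failure would go.
        return false;
    } catch (InvalidArgumentException $e) {
        // In a PHPUnit test, assert on the exception here.
        return $e->getMessage() === "File not found";
    }
}

var_dump(loadThrowsWhenFileMissing());
```

The forced failure in the try part matters: without it, a load() that silently succeeds would leave the test with nothing to fail on.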

This obviously doesn’t handle setting and getting configuration values, so let’s add those.

Here are the additions to the Config.php file:

    public function set($key = null, $value = null)
    {
        if ($key == null) {
            throw new BadFunctionCallException("Key not set");
        }
        if ($value == null) {
            unset ($this->arrValues[$key]);
            return true;
        } else {
            $this->arrValues[$key] = $value;
            return true;
        }
    }

    public function get($key = null)
    {
        if ($key == null) {
            throw new BadFunctionCallException("Key not set");
        }
        if (isset($this->arrValues[$key])) {
            return $this->arrValues[$key];
        } else {
            return null;
        }
    }

And the default.config.php file:

<?php
$this->set('demo', true);

And lastly, the changes to the ConfigTest.php file:

    public function testLoadConfig()
    {
        $config = new Config();
        $this->assertTrue(is_object($config));
        $config->load('default.config.php');
        $this->assertTrue($config->get('demo'));
    }

    /**
     * @expectedException BadFunctionCallException
     */
    public function testFailSettingValue()
    {
        $config = new Config();
        @$config->set();
    }

    /**
     * @expectedException BadFunctionCallException
     */
    public function testFailGettingValue()
    {
        $config = new Config();
        @$config->get();
    }

We’ve not actually finished testing this yet. Not sure how I can tell?

phpunit --coverage-text Tests/ConfigTest.php
PHPUnit 3.6.10 by Sebastian Bergmann.

....

Time: 0 seconds, Memory: 3.75Mb

OK (4 tests, 5 assertions)

Generating textual code coverage report, this may take a moment.

Code Coverage Report
  2012-05-08 18:54:16

 Summary:
  Classes: 0.00% (0/1)
  Methods: 0.00% (0/3)
  Lines:   76.19% (16/21)

@Config::Config
  Methods: 100.00% ( 3/ 3)   Lines:  76.19% ( 16/ 21)

Notice that there are 5 lines outstanding – probably around the unsetting values and using default values. If you use an IDE (like NetBeans) you can actually get the editor to show you, using coloured lines, exactly which lines you’ve not yet tested! Nice.
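One way to reach the untested lines is to exercise the branch of set() that removes a value, and to call get() on a key that was never set. The sketch below is standalone rather than a PHPUnit test: it repeats set() and get() from above and assumes an $arrValues property declared as an empty array, which the listings above do not show. In ConfigTest.php the same checks would be written with $this->assertTrue() and $this->assertNull().

```php
<?php
class Config
{
    // Assumed declaration; the original listing does not show this property.
    protected $arrValues = array();

    public function set($key = null, $value = null)
    {
        if ($key == null) {
            throw new BadFunctionCallException("Key not set");
        }
        if ($value == null) {
            unset($this->arrValues[$key]);
            return true;
        }
        $this->arrValues[$key] = $value;
        return true;
    }

    public function get($key = null)
    {
        if ($key == null) {
            throw new BadFunctionCallException("Key not set");
        }
        return isset($this->arrValues[$key]) ? $this->arrValues[$key] : null;
    }
}

$config = new Config();
$config->set('demo', true);
$before = $config->get('demo');        // the stored value
$config->set('demo');                  // no value given, so the key is unset
$after = $config->get('demo');         // the key no longer exists
$missing = $config->get('never set');  // unknown keys take the same path

var_dump($before, $after, $missing);
```

Writing these tests also shows up a quirk worth knowing about: because set() compares with ==, values such as false, 0 or an empty string take the unset branch as well.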

So, the last thing to talk about is Containers and Dependency Injection. We’ve already started with the Dependency Injection here: the $config->load('filename'); method handles loading config files, or you could just bypass that with $config->set('key', 'value');. Once you get past a file or two, though, you might end up with a lot of redundant re-loading of config files or, worse, lots of open database connections.

So, this is where Containers come in (something I horrifically failed to understand before).

Here’s a container:

class ConfigContainer
{
  protected static $config = null;

  public static function Load()
  {
    if (self::$config == null) {
      self::$config = new Config();
      self::$config->load();
    }
    return self::$config;
  }
}

Its purpose (in this case) is to load the config class, including any dependencies that you may need for that class, and then return that object to you. You could conceivably create a Database container, a Request container or a User container with very little extra work. With a few short calls, you then have a single function for each of your regular and routine sources of data, without losing the ability to test that data easily and repeatably: a test simply doesn’t go through the container.
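To illustrate the point about avoiding redundant re-loading, here is a minimal sketch that calls the container twice and checks that the configuration is only loaded once. The stub Config class and its $loadCount counter are assumptions made so the example runs on its own; they stand in for the real class above.

```php
<?php
// Stub standing in for the real Config class; it only counts load() calls.
class Config
{
    public static $loadCount = 0;

    public function load()
    {
        self::$loadCount++;
    }
}

class ConfigContainer
{
    protected static $config = null;

    public static function Load()
    {
        if (self::$config == null) {
            self::$config = new Config();
            self::$config->load();
        }
        return self::$config;
    }
}

$first = ConfigContainer::Load();
$second = ConfigContainer::Load();

var_dump($first === $second);  // do both calls hand back the same object?
var_dump(Config::$loadCount);  // how many times has load() run?
```

A unit test can still create its own object with new Config() and never touch the container, which is what keeps the class easy to test in isolation.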

Of course, there’s nothing to stop you just having these created in a registry class, or store them in a global from the get-go, but, I am calling these “Better Practices” after all, and these are considered to be not-so-good-practices.

Just as a note, code from this section can be seen at GitHub, if you want to use them at all.

Update 2012-05-11: Added detail to the try/catch exception catching as per frimkron’s comment. Thanks!

php-PDO-MySQL versus php-mysql

PHP and MySQL was the duo that took dynamic web development from /cgi/formmail.cgi to… well, where we are today for me. My first proper experience of working with a web application, as opposed to a static web site with a few perl scripts bolted on was PHPNuke, and I’ve not looked back since.

Almost the first lines of PHP you learn to code are

<?php
mysql_connect("localhost", "root", "");
mysql_select_db("mysite");

Followed closely by mysql_query(), then mysql_fetch_array(), mysql_num_rows(), and if the rest of us are lucky, mysql_real_escape_string().

So you see, for me, it’s really weird that this is still what we’re teaching everyone to use: an incantation where we have to wrap every query up in a cotton wool ball of SQL injection protection. There’s an easier way where you don’t have to wrap everything, which, even better, is more efficient at your SQL server and, if you’ve made the transition to using Classes in your code, will give you an object per result rather than an array.

Let’s look at some examples of code, and see how we would transition things.

Here’s some rather unoriginal code to get the contents of a user table.

<?php
$db = mysql_connect("localhost", "root", "");
mysql_select_db("mysite");
$sql = "select userid, realname, lastloggedin from users "
     . "where username = '" . mysql_real_escape_string($_POST['username'])
     . "' and password = '" . mysql_real_escape_string($_POST['password']) . "'";
$query = mysql_query($sql);
$data = false;
if (mysql_num_rows($query) == 1) {
  $data = mysql_fetch_array($query);
  echo "Hello {$data['realname']}, your userid is {$data['userid']} and "
     . "you last logged in at " . date("H:i:s", strtotime($data['lastloggedin']))
     . " on " . date("d-M-Y" , strtotime($data['lastloggedin']));
}

There, now there’s not much wrong with that (ignoring the security hole of having your passwords in cleartext), is there? All pretty readable, right? Well… what happens if your database is down? How do you handle that in some kind of sensible way? You need to wrap some code around the mysql_connect to check it actually connected. What if someone dropped the database by mistake, or you connected to the wrong database server and that table wasn’t there? Now you’re wrapping the line that selects the database too. Did you make a typo in the SQL? Are you sure you didn’t miss a mysql_real_escape_string in there somewhere…

Here’s the same thing in PDO, protected against all of those things (again, except the cleartext password).

<?php
try {
  $db = new PDO("mysql:host=localhost;dbname=mysite", "root", "");
  $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $sql = "select userid, realname, lastloggedin from users where username = ? and password = ?";
  $query = $db->prepare($sql);
  $query->execute(array($_POST['username'], $_POST['password']));
  $data = $query->fetch();
  if ($data != false) {
    echo "Hello {$data['realname']}, your userid is {$data['userid']} and "
       . "you last logged in at " . date("H:i:s", strtotime($data['lastloggedin']))
       . " on " . date("d-M-Y" , strtotime($data['lastloggedin']));
  }
} catch (PDOException $e) {
  error_log("User unable to login: " . $e->getMessage());
}

So, let’s look at what transforms this into something much more interesting: $query->fetchObject();

<?php
try {
  $db = new PDO("mysql:host=localhost;dbname=mysite", "root", "");
  $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $sql = "select userid, realname, lastloggedin, username, password " 
       . "from users where username = ? and password = ?";
  $query = $db->prepare($sql);
  $query->execute(array($_POST['username'], $_POST['password']));
  $data = $query->fetchObject('user');
  if ($data != false) {
    echo $data;
  }
} catch (PDOException $e) {
  error_log("User unable to login: " . $e->getMessage());
}

class user
{
  // Columns from the database
  protected $userid = null;
  protected $realname = null;
  protected $lastloggedin = null;
  protected $username = null;
  protected $password = null;
  // Processed Data
  protected $transformed_lastloggedin = null;

  // PDO fills in the column properties before calling the constructor
  public function __construct()
  {
    if ($this->lastloggedin != null) {
      $this->transformed_lastloggedin = date("H:i:s", strtotime($this->lastloggedin))
                             . " on " . date("d-M-Y" , strtotime($this->lastloggedin));
    }
  }

  // Called automatically by "echo $data;", so it must return the string
  public function __toString()
  {
    return "Hello {$this->realname}, your userid is {$this->userid} and "
         . "you last logged in at {$this->transformed_lastloggedin}";
  }
}

I’m starting to stretch the analogy here a little, but the important things here are:

  1. The __construct() function of the class is run after the data is inserted into it, meaning you can post-process any data you receive from the database without needing any new functions. Arguably, you could make the constructor accept the same values you’d receive from the database, and process those in the same way.
  2. The __toString() function means you can mangle the content you want to output, without having to explicitly call another function on the object you’ve just created.
  3. By using classes you get to do wonderful things like Unit Testing (this is a topic to follow), have automatic code documentation (err… to follow), reduce code duplication by extending classes, and check on that using “Copy and Paste Detection”.
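The first point is easiest to see by running it. The sketch below swaps MySQL for an in-memory SQLite database (an assumption, purely so that it runs without a server; it needs the pdo_sqlite extension), and the table contents and the DemoUser class are made up for illustration. It records what the constructor could see, to show that fetchObject() has already filled in the column properties by then.

```php
<?php
class DemoUser
{
    protected $userid = null;
    protected $realname = null;
    // Set in the constructor to record what it could see at that point.
    protected $seenInConstructor = null;

    public function __construct()
    {
        // PDO has already assigned the column values by now.
        $this->seenInConstructor = $this->realname;
    }

    public function __toString()
    {
        return "Hello {$this->realname}, your userid is {$this->userid}";
    }

    public function seenInConstructor()
    {
        return $this->seenInConstructor;
    }
}

// In-memory SQLite stands in for the MySQL server used in the post.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("create table users (userid integer, realname text, username text)");
$db->exec("insert into users values (1, 'Demo Person', 'demo')");

$query = $db->prepare("select userid, realname from users where username = ?");
$query->execute(array('demo'));
$user = $query->fetchObject('DemoUser');

echo $user, "\n";                      // echo triggers __toString()
var_dump($user->seenInConstructor());  // what the constructor saw
```

If you would rather have the constructor run before the properties are filled in, PDO offers the PDO::FETCH_PROPS_LATE flag for use with PDO::FETCH_CLASS.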

Using an Amazon EC2 instance as an off-site CronJob

I run the CCHits.net website, and part of the day-to-day tasks that running the site entailed was the daily show creation, which involved creating some text-to-speech audio for insertion into the podcasts. As I run the website on shared hosting, where I didn’t have full access to the OS, I couldn’t just install Festival on the platform, and for whatever reason (I never did figure out what went wrong), I couldn’t build Festival to run on the shared host either.

Until “The Big Rewrite” (the capitals are totally worth it), I’d been doing the text-to-speech on my home server, but frankly, I’m on DSL, which meant I needed to set up Dynamic DNS, I had to be sure the server was always up (and it wasn’t!), etc, etc, etc. While I was looking into why I couldn’t get Festival to build, someone said “Well, why not just use EC2 to do it”.

After nearly a year of faffing about trying to make the …….. stupid thing work (as is testified by the draft in this very blog called “How I built my Audio Stack for CCHits”), I finally decided to spin up an EC2 instance for just this one task.

Now, I’m not the greenest guy on the block (hell, I drive 45 minutes into work each day), but I figured: why keep an EC2 instance running all the time when I only need it for less than 20 minutes each day? So I did some reading, and found a post on making EC2 do the hard work for you, using the scalable computing APIs. Frankly, though, all I actually needed was the code to make it spin up, run the task and shut down again, especially as with his methods I’d have needed to either create an AMI image or download the Festival voice files each time… at around 100Mb. Not good. I ended up using the stuff I did know (bash scripting, cron tasks) and ditching the stuff I didn’t (AMI files, scalable computing API). I may revisit this later to do it the way he said instead. We’ll see :)

So, here’s the crack.

Create your EC2 instance. It doesn’t need to do anything fancy yet – just boot up and keep running. You’ll do some tweaks later. Make a note of the instance number – it’ll probably start i- and then 8 or so hexadecimal digits, like this: i-12345678.

On your shared web host, download the EC2 API tools. According to this page, the API tools are available from here. The first link of those two is unlikely to change for a while, the second, maybe more so. You’ll need to make sure you have Java installed on that web host.

Once you’ve got the tools, you’ll need to create an X.509 certificate and key pair. See this page to find instructions. It was pretty straightforward.

So, you now have in, let’s say for sake of argument, your home directory:

  • /home/user/ec2-api-tools-x.x.x.x # The directory containing all the EC2 API tools
  • /home/user/ec2-keys/pk-{32[A-Za-z1-0]}.pem
  • /home/user/ec2-keys/cert-{32[A-Za-z1-0]}.pem

Also, you have java in /usr/bin/java.

Create the following script also in /home/user/ec2-api-tools-x.x.x.x – I called it ec2-wrapper.

#!/bin/bash
export EC2_HOME=/home/user/ec2-api-tools-x.x.x.x
export JAVA_HOME=/usr
export EC2_KEY=/home/user/ec2-keys/pk-{32[A-Za-z1-0]}.pem
export EC2_CERT=/home/user/ec2-keys/cert-{32[A-Za-z1-0]}.pem
${EC2_HOME}/bin/$* -K ${EC2_KEY} -C ${EC2_CERT}

Obviously, you should change your paths to match what you have. What this script does is add the X.509 key and certificate to every EC2 request, and set the appropriate JAVA_HOME and EC2_HOME paths before running the requested tool.

I set up a CRON job (using crontab -e) to schedule the regular startup of the instance. Here’s the entry from my crontab:

#M   H  DoM Mth DoW  Command (Regular Crontab columns)
30   1   *   *   *   /home/user/ec2-api-tools-x.x.x.x/ec2-wrapper ec2-start-instances i-12345678
30   2   *   *   *   /home/user/ec2-api-tools-x.x.x.x/ec2-wrapper ec2-stop-instances i-12345678

So, this runs the start task at 30 minutes past 1am, local server time, and the stop task at 30 minutes past 2am. The second one there is just to be on the safe side, as we’ll try to shut down the box once it’s finished processing anyway. This way, the maximum time you’ll be billed for is 1 hour of time each day.

I then logged into my EC2 machine, and created, then tweaked the script from the earlier blog post (the scalable computing one).

#!/bin/bash -x
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
#
# This code is licensed under an Apache License - for the original 
# code and license, please see the footer of this script.
#
# !!!IMPORTANT!!!
# Edit this file and change this next line to your own email address:
#

EMAIL=user@example.com

# Get some information about the running instance
instance_id=$(wget -qO- instance-data/latest/meta-data/instance-id)
public_hostname=$(wget -qO- instance-data/latest/meta-data/public-hostname)

# Wait 5 minutes in case we want to get in to stop this from doing stuff
sleep 300

if [ -f /home/ubuntu/donotrun ]
then
  exit 0
fi

# Send status email
/usr/sbin/sendmail -oi -t -f $EMAIL <<EOM
From: $EMAIL
To: $EMAIL
Subject: Running EC2 Scripts

== Making sure everything is up to date ==

`cd /home/ubuntu/website-rewrite && su -c "git pull" ubuntu 2>&1`

== Running the scheduled task ==

`php /home/ubuntu/website-rewrite/scheduled_task.php`

== Notes ==

This email was generated on the EC2 instance: $instance_id

If the instance is still running, you can monitor the output of this
job using a command like:

  ssh ubuntu@$public_hostname tail -1000f /var/log/user-data.log

EOM

# Give the email some time to be queued and delivered
sleep 300 # 5 minutes

if [ -f /home/ubuntu/shutdownwhendone ]
then
  shutdown -h now
fi

exit 0

########################################################################
# For more information about the original version of this code see:
#   http://alestic.com/2011/11/ec2-schedule-instance
# The original code and its license are available on github:
#   https://github.com/alestic/demo-ec2-schedule-instance
########################################################################

So, with that, I get a script which runs on schedule on an EC2 platform, and I get a confirmation e-mail that it’s running. It shuts itself down and, hopefully, keeps on trucking :)

CampFireManager Workshop

NOTE: Images have been removed from this post 2017-05-02

What is CampFireManager?

CampFireManager is a tool to schedule talks at a Conference or Unconference (such as a barcamp). It is written in PHP and uses a MySQL backend to store the data about the event. It optionally can use SMS messaging and Microblogging services (like identi.ca or Twitter) to perform routine tasks on the system.

A modified version (CampFireManager-Lite) of CampFireManager was recently used at OggCamp ’10, in combination with tools created by Xibo. Here’s a look at what you would have seen if you were there:

CampFireManager-Lite

This is the main timetable. Note, it only shows two talks before “now”, the talk on “now” and the next 6 slots’ worth of talks. This caused us problems on the Saturday afternoon when we tried to display the timetable for the next day. The only way we could do that was to change the system time on the server.

This is the now and next screen for the main stage, and a screen showing similar information for each of the stages. By default, each screen shows, in turn, the timetable, the “now and next” screen for all stages and then the “now and next” screen for each individual stage. This is configured in the database, but should *really* have been configured at the screen.

Administration was performed by organisers. Data was collected on sheets, and then entered into the system. Using an in-built version of the timetable, these were placed into the appropriate slots in the timetable, which then updated all the other screens.

This, however, isn’t how I originally envisaged CampFireManager. The full version of CampFireManager is designed to operate with minimal administrative overhead from on-site staff.

CampFireManager

Users (as well as support staff and administrators) would log into the website using OpenID. Several common providers are pre-populated and available using icons to select them.

On logging in, all users initially see a timetable, with no definitions of rooms, but with the complete timetable for this day. At the top, under the “Slot” times, is a hyperlink showing “New Talk”. Clicking on this brings you to a page where you can enter your talk proposal. Clicking on “Go” inserts that talk into your timetable.

If you wanted to enter some contact details for your talk (so that your peers at the event can reach you), this is done by clicking on “Amend Contact Details”, and entering your details. This will retrospectively correct your details on all the talks you’re giving.

By clicking on “Add other access method” – you can add an “authorization code” (which is a 5 character string of case sensitive, non-similarly rendered letters and numbers) to allow you to update your talks, your contact details or the talks you’ll be attending from other communication methods – SMS or XMPP being the two primary examples.

However, I realised that I could use this access method logic to allow administrators the ability to grant administrative access to other organisers, or to promote staff to “Support” staff. In the same way you’d enter a 5 character string to give the same rights to your phone as your web access, you could enter a 9 character string (again, non-similarly rendered, case sensitive letters and numbers) to become an administrator or support staff member. Here you see an Administrator’s view of the timetable, which adds two extra options at the top box – Provide support to attendees and modify config values.
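As a rough idea of how such codes could be generated (this is a sketch, not CampFireManager’s actual code), the function below picks characters at random from an alphabet that leaves out easily confused glyphs. The function name, the exact alphabet and the use of random_int() (PHP 7 or later) are all assumptions.

```php
<?php
// Hypothetical helper, not taken from CampFireManager itself.
function generateAuthCode($length)
{
    // Mixed-case letters and digits, minus look-alikes such as 0/O/o and 1/l/I.
    $alphabet = 'ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz23456789';
    $max = strlen($alphabet) - 1;
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[random_int(0, $max)];
    }
    return $code;
}

$attendeeCode = generateAuthCode(5); // for phone or XMPP access
$staffCode = generateAuthCode(9);    // for administrator or support access
echo $attendeeCode, "\n", $staffCode, "\n";
```

With 56 characters to choose from, the longer staff code is far harder to guess than the attendee one, which fits the extra rights it grants.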

So, let’s support an attendee. We enter some details, either their Auth Code (if they know it), or some details about them they entered previously – a blog address, e-mail, some other contact method, or in extreme cases, their name. If we’ve had to use any form of wildcard in the search (that is, I typed “Jon” into the name box), then we get prompted to confirm it’s them, and can give them their unique Auth Code. If they’ve not registered on the site, then click on Create New Auth Code to create a new user. In either case, we get a page showing “This is with an AuthCode: ". From here we can amend their contact details, propose a new talk on their behalf, or if there's a talk in the future, show their attendance (which, to be fair, here I couldn't do, as I'd not created a talk for them to attend).

Here’s the administrative screen, where you see options that have configured the screens and access methods.

There’s a mobile interface to the site, which is quite similar to the support staff’s view of the site, and again it uses OpenID to authenticate. It gives the user access to a slimmed down portal – showing all talks which are yet to occur (allowing the user to show or remove their attendance), just this user’s talks yet to occur, and allowing the user to add a new talk and amend their contact details.

Those are all the interactive screens, so here are the two most regularly seen, non-interactive screens, both with a scrolling box between the event title and the content saying (in this instance):

Phones: 07 88 24 64 91 8 on the Three Network (with 18% signal)
Website: https://home.north.sprig.gs/CampFireManager/
Mobile site: https://home.north.sprig.gs/CampFireManager/m/
Event Hashtag: #oc10
Identify with this service by sending
I <your name> [email:your@email.address] [http://your.web.site]
(there are more options for identification by going to the website)
Propose a talk by sending P <Time Slot> <Slots Used> <Talk Title>
Cancel a talk by sending C <Talk Number> <Time Slot> [Reason]
Rename a talk by sending E <Talk Number> <Time Slot> <New Talk Title>
Attend a talk by sending A <Talk Number>
Decline the attendance of a talk by sending R <Talk Number>
Note: You can combine multiple A and R commands in one message.
Statements surrounded with <> are mandatory options, those statements surrounded with [] are optional.
These commands should be sent to your preferred mobile service listed above.
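To give a feel for how a message in this format might be handled, here is a small sketch that parses only the “P” (propose a talk) command. It is not CampFireManager’s own parser: the function name, the returned array keys and the assumption that the time slot and slots used are both whole numbers are mine.

```php
<?php
// Hypothetical parser for the "P <Time Slot> <Slots Used> <Talk Title>" format.
function parseProposal($message)
{
    // Assumes both slot values are whole numbers; the title is the remainder.
    if (!preg_match('/^P\s+(\d+)\s+(\d+)\s+(.+)$/', trim($message), $m)) {
        return null; // not a well-formed proposal
    }
    return array(
        'slot'   => (int) $m[1],
        'length' => (int) $m[2],
        'title'  => $m[3],
    );
}

$talk = parseProposal('P 3 1 Getting started with PHPUnit');
$broken = parseProposal('P 3');

var_dump($talk, $broken);
```

The other commands follow the same shape, a single letter followed by space-separated fields, so each could get a similar pattern of its own.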

The Timetable:

And the “Direction” screens (like the now and next screens on the CampFireManager-Lite):

The above URLs do work, and the system is available for experimentation. The code and ticket tracker (if you want to run your own, local, instance) is at http://code.google.com/p/campfiremanager. Please contact me if you want to get access to my instance of CampFireManager from an administrative or support perspective. Please note, the SMS engine is not running.