php-PDO-MySQL versus php-mysql

PHP and MySQL was the duo that took dynamic web development from /cgi/formmail.cgi to… well, where we are today for me. My first proper experience of working with a web application, as opposed to a static web site with a few perl scripts bolted on was PHPNuke, and I’ve not looked back since.

Almost the first lines of PHP you learn to code are

<?php
mysql_connect("localhost", "root", "");
mysql_use_db("mysite");

Followed closely by mysql_query(), then mysql_fetch_array(), mysl_num_rows(), and if the rest of us are lucky, mysql_real_escape_string().

So you see, for me, it’s really weird to find that even though this is what we’re teaching everyone to use, this weird incantation where we have to wrap every query up in a cotton wool ball of SQL injection protection… there’s an easier way where you don’t have to wrapper everything, which, even better, is more efficient at your SQL server and, if you’ve made the transition to using Classes in your code, will give you an object per-result rather than an array.

Let’s look at some examples of code, and see how we would transition things.

Here’s some rather unoriginal code to get the contents of a user table.

<?php
$db = mysql_connect("localhost", "root", "");
mysql_use_db("mysite");
$sql = "select userid, realname, lastloggedin from users "
     . "where username = '" . mysql_real_escape_string($_POST['username'])
     . "' and password = '" . mysql_real_escape_string($_POST['password']) . "'";
$query = mysql_query($sql);
$data = false;
if (mysql_num_rows($query) == 1) {
  $data = mysql_fetch_array($query);
  echo "Hello {$data['realname']}, your userid is {$data['userid']} and "
     . "you last logged in at " . date("H:i:s", strtotime($data['lastloggedin']))
     . " on " . date("d-M-Y" , strtotime($data['lastloggedin']));
}

There, now there’s not much wrong with that (ignoring the security hole of having your passwords in cleartext) is there? All pretty readable, right? Well… what happens if your database is down, how do you handle that in some kind of sensible way? You need to wrap some code around the mysql_connect to show it actually connected. What about if someone dropped the database by mistake, or you connected to the wrong database server and that table wasn’t there, now you’re wrappering the use_db line too. Did you make a typo in the SQL? Are you sure you didn’t miss a mysql_real_escape_string in there somewhere…

Here’s the same thing in PDO, protected against all (again, except the cleartext password) of those things.

<?php
try {
  $db = new PDO("mysql:host=localhost;dbname=mysite", "root", "");
  $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $sql = "select userid, realname, lastloggedin from users where username = ? and password = ?";
  $query = $db->prepare($sql);
  $query->execute(array($_POST['username'], $_POST['password']));
  $data = $query->fetch();
  if ($data != false) {
    echo "Hello {$data['realname']}, your userid is {$data['userid']} and "
       . "you last logged in at " . date("H:i:s", strtotime($data['lastloggedin']))
       . " on " . date("d-M-Y" , strtotime($data['lastloggedin']));
  }
} catch (PDOException $e) {
  error_log("User unable to login: " . $e->getMessage());
}

So, let’s look at what transforms this into something much more interesting: $query->fetchObject();

<?php
try {
  $db = new PDO("mysql:host=localhost;dbname=mysite", "root", "");
  $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $sql = "select userid, realname, lastloggedin, username, password " 
       . "from users where username = ? and password = ?";
  $query = $db->prepare($sql);
  $query->execute(array($_POST['username'], $_POST['password']));
  $data = $query->fetchObject('user');
  if ($data != false) {
    echo $data;
  }
} catch (PDOException $e) {
  error_log("User unable to login: " . $e->getMessage());
}

class user
{
  // Columns from the database
  protected $userid = null;
  protected $realname = null;
  protected $lastloggedin = null;
  protected $username = null;
  protected $password = null;
  // Processed Data
  protected $transformed_lastloggedin = null

  public function __construct()
  {
    if ($this->lastloggedin != null) {
      $this->transformed_lastloggedin = date("H:i:s", strtotime($this->lastloggedin)) 
                             . " on " . date("d-M-Y" , strtotime($this->lastloggedin);
    }
  }

  public function toString()
  {
    echo "Hello {$this->realname}, your userid is {$this->userid} and " 
       . "you last logged in at {$this->transformed_lastloggedin}";
  }
}

I’m starting to stretch the analogy here a little, but the important things here are:

  1. The __construct() function of the class is run after the data is inserted into it, meaning you can post-process any data you receive from the database without needing any new functions. Arguably, you could make the construct function accept the same values you’d receive from the database, and process those in the same way.
  2. The toString() function means you can mangle the content you want to output, without having to apparently run another function on the class you’ve just created.
  3. By using classes you get to do wonderful things like Unit Testing (this is a topic to follow), have automatic code documentation (err… to follow), reduce code duplication by extending classes and check on that, using “Copy and Paste Detection”.

Using Amazon an EC2 instance as an off-site CronJob

I run the CCHits.net website, and part of the day-to-day tasks that running the site entailed was the daily show creation which involved creating some text-to-speech audio for insertion into the podcasts. As I run the website on shared hosting, to which I didn’t have full access to the OS, I couldn’t just install Festival on the platform, and for whatever reason (I never did figure out what went wrong), I couldn’t build Festival to run on the shared host either.

Until “The Big Rewrite” (the capitals are totally worth it), I’d been doing the text-to-speech on my home server, but frankly, I’m on DSL, which meant I needed to set up Dynamic DNS, I had to be sure the server was always up (and it wasn’t!), etc, etc, etc. While I was looking into why I couldn’t get Festival to build, someone said “Well, why not just use EC2 to do it”.

After nearly a year of faffing about trying to make the …….. stupid thing work (as is testified by the draft in this very blog called “How I built my Audio Stack for CCHits”), I finally decided to spin up an EC2 instance for just this one task.

Now, I’m not the greenest guy on the block – hell, I drive 45 minutes into work each day, but I figured, why keep an EC2 instance running all the time, when I only need it for less than 20 minutes each day, so I did some reading, and found a post on making EC2 do the hard work for you, using the scalable computing APIs, but frankly, all I actually needed was the code to make it spin up, run the task and shut down again, especially as with using his methods, I’d have needed to either create an AMI image, or download the festival voice files each time… at around 100Mb. Not good. I ended up using the stuff I did know (bash scripting, cron tasks) and ditching the stuff I didn’t (AMI files, scalable computing API). I may revisit this later to do it the way he said instead. We’ll see :)

So, here’s the crack.

Create your EC2 image. It doesn’t need to do anything fancy yet – just boot up and keep running. You’ll do some tweaks later. Make a note of the instance number – it’ll probably start i- and then 8 or so hexedecimal digits, like this: i-12345678.

On your shared web host, download the EC2 API tools. According to this page, the API tools are available from here. The first link of those two is unlikely to change for a while, the second, maybe more so. You’ll need to make sure you have Java installed on that web host.

Once you’ve got the tools, you’ll need to create an X.509 certificate and key pair. See this page to find instructions. It was pretty straightforward.

So, you now have in, let’s say for sake of argument, your home directory:

  • /home/user/ec2-api-tools-x.x.x.x # The directory containing all the EC2 API tools
  • /home/user/ec2-keys/pk-{32[A-Za-z1-0]}.pem
  • /home/user/ec2-keys/cert-{32[A-Za-z1-0]}.pem

Also, you have java in /usr/bin/java.

Create the following script also in /home/user/ec2-api-tools-x.x.x.x – I called it ec2-wrapper.

#!/bin/bash
export EC2_HOME=/home/user/ec2-api-tools-x.x.x.x
export JAVA_HOME=/usr
export EC2_KEY=/home/user/ec2-keys/pk-{32[A-Za-z1-0]}.pem
export EC2_CERT=/home/user/ec2-keys/cert-{32[A-Za-z1-0]}.pem
${EC2_HOME}/bin/$* -K ${EC2_KEY} -C ${EC2_CERT

Obviously, you should change your paths to match what you have. What this script does is to add the X.509 certs to every EC2 request, plus adds the appropriate java and EC2_HOME paths to the script before running it.

I set up a CRON job (using crontab -e) to schedule the regular startup of the instance. Here’s the entry from my crontab:

#M   H  DoM Mth DoW  Command (Regular Crontab columns)
30   1   *   *   *   /home/user/ec2-api-tools-x.x.x.x/ec2-wrapper ec2-start-instances i-12345678
30   2   *   *   *   /home/user/ec2-api-tools-x.x.x.x/ec2-wrapper ec2-stop-instances i-12345678

So, this runs the start task at 30 minutes past 1am, local server time, and the stop task at 30 minutes past 2am. The second one there is just to be on the safe side, as we’ll try to shut down the box once it’s finished processing anyway. This way, the maximum time you’ll be billed for is 1 hour of time each day.

I then logged into my EC2 machine, and created, then tweaked the script from the earlier blog post (the scalable computing one).

#!/bin/bash -x
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
#
# This code is licensed under an Apache License - for the original 
# code and license, please see the footer of this script.
#
# !!!IMPORTANT!!!
# Edit this file and change this next line to your own email address:
#

EMAIL=user@example.com

# Get some information about the running instance
instance_id=$(wget -qO- instance-data/latest/meta-data/instance-id)
public_hostname=$(wget -qO- instance-data/latest/meta-data/public-hostname)

# Wait 5 minutes in case we want to get in to stop this from doing stuff
Sleep 300

if [ -f /home/ubuntu/donotrun ]
then
  exit 0
fi

# Send status email
/usr/sbin/sendmail -oi -t -f $EMAIL <<EOM
From: $EMAIL
To: $EMAIL
Subject: Running EC2 Scripts

== Making sure everything is up to date ==

`cd /home/ubuntu/website-rewrite && su -c "git pull" ubuntu 2>&1`

== Running the scheduled task ==

`php /home/ubuntu/website-rewrite/scheduled_task.php`

== Notes ==

This email was generated on the EC2 instance: $instance_id

If the instance is still running, you can monitor the output of this
job using a command like:

  ssh ubuntu@$public_hostname tail -1000f /var/log/user-data.log

EOM

# Give the email some time to be queued and delivered
sleep 300 # 5 minutes

if [ -f /home/ubuntu/shutdownwhendone ]
then
  shutdown -h now
fi

exit 0

########################################################################
# For more information about the original version of this code see:
#   http://alestic.com/2011/11/ec2-schedule-instance
# The original code and its license are available on github:
#   https://github.com/alestic/demo-ec2-schedule-instance
########################################################################

So, with that, I get a script which runs on schedule, on an EC2 platform, I get a confirmation e-mail it’s running. It shuts itself down, and hopefully, keeps on trucking :)

CampFireManager Workshop

NOTE: Images have been removed from this post 2017-05-02

What is CampFireManager?

CampFireManager is a tool to schedule talks at a Conference or Unconference (such as a barcamp). It is written in PHP and uses a MySQL backend to store the data about the event. It optionally can use SMS messaging and Microblogging services (like identi.ca or Twitter) to perform routine tasks on the system.

A modified version (CampFireManager-Lite) of CampFireManager was recently used at OggCamp ’10, in combination with tools created by Xibo. Here’s a look at what you would have seen if you were there:

CampFireManager-Lite

This is the main timetable. Note, it only shows two talks before “now”, the talk on “now” and the next 6 slot’s worth of talks. This caused us problems on the Saturday afternoon when we tried to display the timetable for the next day. The only way we could do that was to change the system time on the server.

This is the now and next screen for the main stage, and a screen showing similar information for each of the stages. By default, each screen shows, in turn, the timetable, the “now and next” screen for all stages and then the “now and next” screen for each individual screen. This is entered by the database, but should *really* have been configured at the screen.

Administration was performed by organisers. Data was collected on sheets, and then entered into the system. Using an in-built version of the timetable, these were placed into the appropriate slots in the timetable, which then updated all the other screens.

This, however, isn’t how I originally envisaged CampFireManager. The full version of CampFireManager is designed to operate with minimal administrative overhead from on-site staff.

CampFireManager

Users (as well as support staff and administrators) would log into the website using OpenID. Several common providers are pre-populated and available using icons to select them.

On logging in, all users initially see a timetable, with no definitions of rooms, but with the complete timetable for this day. At the top, under the “Slot” times, is a hyperlink showing “New Talk”. Clicking on this brings you to a page where you can enter your talk proposal. Clicking on “Go” inserts that talk into your timetable.

If you wanted to enter some contact details for your talk (so that your peers at the event can reach you), this is done by clicking on “Amend Contact Details”, and entering your details. This will retrospectively correct your details on all the talks you’re giving.

By clicking on “Add other access method” – you can add an “authorization code” (which is a 5 character string of case sensitive, non-similarly rendered letters and numbers) to allow you to update your talks, your contact details or the talks you’ll be attending from other communication methods – SMS or XMPP being the two primary examples.

However, I realised that I could use this access method logic to allow administrators the ability to grant administrative access to other organisers, or to promote staff to “Support” staff. In the same way you’d enter a 5 character string to give the same rights to your phone as your web access, you could enter a 9 character string (again, non-similarly rendered, case sensitive letters and numbers) to become an administrator or support staff member. Here you see an Administrator’s view of the timetable, which adds two extra options at the top box – Provide support to attendees and modify config values.

So, let’s support an attendee. We enter some details, either their Auth Code (if they know it), or some details about them they entered previously – a blog address, e-mail, some other contact method, or in extreme cases, their name. If we’ve had to use any form of wildcard in the search (that is, I typed “Jon” into the name box), then we get prompted to confirm it’s them, and can give them their unique Auth Code. If they’ve not registered on the site, then click on Create New Auth Code to create a new user. In either case, we get a page showing “This is with an AuthCode: ". From here we can amend their contact details, propose a new talk on their behalf, or if there's a talk in the future, show their attendance (which, to be fair, here I couldn't do, as I'd not created a talk for them to attend).

Here’s the administrative screen, where you see options that have configured the screens and access methods.

There’s a mobile interface to the site, which is quite similar to the support staff’s view of the site, and again it uses OpenID to authenticate. It gives the user access to a slimmed down portal – showing all talks which are yet to occur (allowing the user to show or remove their attendance), just this user’s talks yet to occur, and allowing the user to add a new talk and amend their contact details.

That is all the interactive screens shown, so here’s the two most regularly seen, non-interactive screens, both with a scrolling box between the event title and the content saying (in this instance):

Phones: 07 88 24 64 91 8 on the Three Network (with 18% signal)
Website: https://home.north.sprig.gs/CampFireManager/
Mobile site: https://home.north.sprig.gs/CampFireManager/m/
Event Hashtag: #oc10
Identify with this service by sending
I <your name> [email:your@email.address] [http://your.web.site]
(there are more options for identification by going to the website)
Propose a talk by sending P <Time Slot> <Slots Used> <Talk Title>
Cancel a talk by sending C <Talk Number> <Time Slot> [Reason]
Rename a talk by sending E <Talk Number> <Time Slot> <New Talk Title>
Attend a talk by sending A <Talk Number>
Decline the attendance of a talk by sending R <Talk Number>
Note: You can combine multiple A and R commands in one message.
Statements surrounded with <> are mandatory options, those statements surrounded with [] are optional.
These commands should be sent to your preferred mobile service listed above.

The Timetable:

And the “Direction” screens (like the now and next screens on the CampFireManager-Lite):

The above URLs do work, and the system is available for experimentation. The code and ticket tracker (if you want to run your own, local, instance) is at http://code.google.com/p/campfiremanager. Please contact me if you want to get access to my instance of CampFireManager from an administrative or support perspective. Please note, the SMS engine is not running.