There are three things you should never ever do in PHP:

  1. Use short tags (<? and ?>) or ASP-style tags (<% and %>) rather than the standard <?php and ?>
  2. Use the addslashes or stripslashes functions
  3. Enable magic_quotes_gpc or magic_quotes_runtime

Actually there are a lot more things you should never ever do in PHP, but we’ll start with these.

The first rule should be self-explanatory. Many servers have short tags disabled, and I doubt there’s a single server in the entire universe with ASP-style tags enabled, so the only way to ensure that your application will work everywhere is to use the standard long tags.

I could write several articles all about the evils of magic quotes and why doing the slash-dance with addslashes and stripslashes is a surefire way to produce unsecure and unmaintainable code, but plenty of articles have already been written on this subject. Suffice it to say that if you find yourself using addslashes or stripslashes, it’s time to stop and rethink the design of your application. You should never need to use these functions. Ever. I’m about to describe a much better (and far easier) way to protect yourself from SQL injection attacks.

Now, let’s pretend we’re developing a web portal for monkeys and we’re calling it, oddly enough, Monkey.

First we’ll create a common script that will be included by every other script in our application. In this file, we’ll define our database settings, include our application’s classes, and initialize our framework. Because we’re so original, we’ll call it common.php. This concept shouldn’t be anything new to you if you’ve been developing PHP apps for any length of time.

common.php:

<?php
// Database settings.
define('DB_HOST', 'localhost');
define('DB_USER', 'myuser');
define('DB_PASS', 'mypass');
define('DB_NAME', 'monkey');

// Load classes.
require_once 'classes/Monkey.php';

// Initialize the application.
Monkey::init();
?>

Notice the last line: we’ve made a call to the static function init of class Monkey. This is the base class of our application, and init is the function that’ll open a database connection and get everything ready to go.

We’re using PHP 5 because we’re not silly, so we’ll make Monkey an abstract class. In essence, the Monkey class will serve as our application’s main namespace, keeping our own methods and variables separate from those of PHP. Since PHP’s global namespace is incredibly polluted, this will make our life easier down the road.

The init function will do two things: connect to the database and load an XML file containing all the SQL queries our application will be using. By keeping our queries in this XML file, separate from our code, we’ll make our application vastly more maintainable. Mixing SQL queries with PHP code makes for a big mess. In addition, we’ll use an incredibly simple (but powerful) form of parameter substitution to ensure that every single piece of data we send to MySQL will be properly escaped, thus protecting us from SQL injection attacks, all with almost no effort on our part.

Here’s an example showing how our XML file (which we’ll name queries.xml) might look:

db/queries.xml

<?xml version="1.0"?>
<queries>
  <!-- Table: user -->
  <query name="user.getByUsername">
    <![CDATA[
      SELECT *
      FROM user
      WHERE username = :username
    ]]>
  </query>

  <query name="user.insert">
    <![CDATA[
      INSERT INTO user (
        username,
        password,
        email,
        fur_color,
        height,
        weight
      )
      VALUES (
        :username,
        :password,
        :email,
        :fur_color,
        :height,
        :weight
      )
    ]]>
  </query>
</queries>

Note that you’ll want to either restrict access to the directory containing queries.xml or place it outside your web directory (I forgot to mention this in the first version of this article; thanks to Brett for pointing it out). You wouldn’t want the whole world viewing your SQL queries.

As you can see, we’ve substituted placeholders like :username and :email for incoming values. We’ve also named our queries descriptively in the form “table.action”. Later, when we start executing these database queries from PHP, this will help us understand what the queries are doing. If we were using MySQL 5 or PostgreSQL, we could accomplish something similar using stored procedures, but I actually tend to prefer this method (mostly because I find stored procedures harder to maintain and a bit overkill for most purposes).

Now that we’ve created our query file, let’s create our Monkey class, along with the init function and a query function, which we’ll call whenever we want to execute a database query.

classes/Monkey.php

<?php
abstract class Monkey
{
  public static $db;
  public static $queries;

  public static function init()
  {
    // Open database connection.
    self::$db = new mysqli(DB_HOST, DB_USER, DB_PASS, DB_NAME);

    // Load predefined queries.
    $queriesXml = simplexml_load_file('db/queries.xml');

    self::$queries = array();

    foreach($queriesXml as $query)
      self::$queries[(string)$query['name']] = (string)$query;
  }

  public static function query($name, $parameters = array())
  {
    if (!isset(self::$queries[$name]))
      throw new Exception("Undefined query: $name");

    $sql = self::$queries[$name];

    if (count($parameters))
    {
      $formattedParams = array();

      // Prepend a ':' to each parameter name and escape the value properly.
      foreach($parameters as $paramName => $paramValue)
      {
        if (!is_numeric($paramValue))
        {
          if (is_null($paramValue))
            $paramValue = 'NULL';
          else
            $paramValue = "'".self::$db->real_escape_string($paramValue)."'";
        }

        $formattedParams[":$paramName"] = $paramValue;
      }

      // Replace placeholders in the query with assigned values.
      $sql = strtr($sql, $formattedParams);
    }

    // Display the query if SHOWSQL is true (for debugging purposes).
    if (defined('SHOWSQL') && SHOWSQL)
      echo htmlentities($sql)."<br />\n";

    // Execute the query and return the result.
    return self::$db->query($sql);
  }
}
?>

Take a close look at the query function. This little function will save us tons of time and effort since it will do all the work of executing SQL queries and automatically escaping our input data.

Mull this over in your head a bit. Think about how you’d go about using the query function. Then check back tomorrow for the next part, in which I’ll provide some examples.

Comments

Ah... the progression from PHPSlice to Pants is a thing of beauty.

I might be jumping ahead here, but at this point the queries.xml file hasn't been secured. I wouldn't want someone opening queries.xml in their browser if they knew what they were doing, and using it to their advantage.

Should queries.xml sit outside the web root? Should one employ some mod_rewrite magic?

I've found that .htaccess files are an easy and effective way of hiding such things away.

I use a .htaccess file to deny access to the directory containing queries.xml, but you could just as easily place it outside the web root. I should probably have mentioned that in the article, though.

But of course! I wrote mod_rewrite and meant .htaccess. Thanks for clarifying.

I love this series, keep up the good work.

nice series. thanks.
but one question.
wouldn't it be better to use a signleton pattern instead of an abstract class?

Better? Not really. But not worse either. There'd be no real benefit to using a singleton pattern in this case, unless you just prefer it that way for aesthetic reasons. I prefer using an abstract class myself.

We're really just using this abstract class as a stand-in for a namespace, since PHP doesn't have namespaces.

ok, i just thought about using a singleton instead of an abstract class, because as far as i learned a abstract class is used as a base class to be extended with the actual functional implementation.
so mainly the use of a singleton would be for aesthetic reasons...

I liked the "class" approach, and tried it on my development server. I used this to access the db, and run queries, in my case I am creating a photo db. I did have a problem. the image display was slower using the class methods, etc. I use simple select statements to select info about the image from the db (image location, size, etc) The problem I saw, is that using the oo concept, things ran slower. When I switched the code back to the old-fashionsed in-line style ($qry = "Select * from table where id = $id and p = $p) and using $resultx = @mysql_query($qry);, etc) My web app in terms of displaying, ran faster.

I basically used the methods outlined in the article (a good article). I particularly like the way connections get established, etc. Would the fact that i am using oo type code and regular old-fashioned in line code to access the same db cause this problem (speed that is).

Thanks
ewholz

Object-oriented PHP has slightly more overhead than procedural PHP, so it will almost always run a little bit slower, but I'm surprised that you actually saw a noticeable speed difference.

It's hard to say without seeing the code in question, but I suspect that the OO code itself isn't entirely to blame for the speed decrease you noticed.

As far as I understand, PHP is a stateless language, so implementing certain design patterns are necessary unless the application you're writing forces persistence. I could be wrong.

What is the advantage of declaring Monkey as an Abstract class? Is it to keep this class to a single instance? The PHP docs on Abstract classes are not helping me to understand this.

Thanks Ryan. K

Yep. It’s the closest thing to a namespace in PHP 5. You could also use a Singleton, but an abstract class achieves essentially the same thing with less work.

Thanks Ryan!

This series is great for an PHP5 OO n00b.

K

Hi,

Sorry, I am new to all this. Is there a special reason you put all the queries in an XML file instead of just putting them in a queries.php file? I couldn’t see the difference between the two.

Thanks Z

I played around with both, and I found the XML file easier and more pleasant to work with compared to a PHP file with a big array of ugly multi-line strings. If you’d prefer to keep your queries in a PHP include, there’s no reason that wouldn’t work.

Bug thumbs up for your approach, and for sharing your thoughts! I’m looking for a holy grail of PHP/MySQL application frameworks myself. I really like the XML stored queries. I’m wondering though; you’re escaping arguments based on the is_numeric() function. Will this support SQL functions, esp. those that return numeric values? Those functions will look like non-numeric strings to your function and will be escaped. I’m now basing my escape routine on the column type of the table.

Careful. There’s a big difference between data integrity and security. The point of escaping isn’t just to ensure data integrity. If a user passes in a value containing quotes or other potentially dangerous characters, you must escape them before executing the query, regardless of the data type of the column. Otherwise you open yourself up to injection attacks.

I use is_numeric() because it will look at any PHP data type (string, integer, boolean, anything) and tell me if the data it contains is strictly numeric data (in which case it doesn’t need escaping) or whether there are non-numeric characters (in which case it does need escaping).

This way, any non-numeric value is treated as a string and escaped, regardless of the actual type of column it’s being inserted into. The query may fail due to a data type mismatch, but at least it won’t be vulnerable to injection.

We don’t need to worry about escaping the output of SQL functions because they can’t be exploited to allow injection of malicious commands.

Say that in your query function you needed to go further and validate that the keys provided in the parameters array matched exactly to the colon-prepended (:) names specified in the query definition (in db/queries.xml). I am utterly rubbish at regular expressions but this is what I am thinking:

if (count(array_diff(array_keys($parameters), preg_split('/colonPrependedPattern/', self::$queries[$name]))) !== 0) throw new Exception('Parameter array must match all query arguments!');

What pattern would you use to explode this string

INSERT INTO user (username, password) VALUES (:username, :password)

into this


array('username', 'password');

that could then be compared to $parameters?

Does any of this make sense?


Good work btw! :-)

i like this concept.
im going to try something similar with an ajax call as well…

Thanks very much for your two framework tutorials. I’ve been using your code and another database class from somewhere else to try to build my own two classes. However, I’m having a bit of a problem when calling the “save” function in the User class. It is calling the function in the Database class for saving the data. However, for some reason, when I reference $this→escape to point to a data-escaping function within that same class, the system “thinks” I’m referring to a $this from the User class that called it ???

My “save” function is almost identical to yours. And here’s the function within the Database class that I’m calling:

function query_insert($table, $data) {
$q=“INSERT INTO `”.$this→pre.$table."` ";
$v=‘’; $n=’’;

foreach($data as $key=>$val) { $n.="`$key`, "; if(strtolower($val)==‘null

Bah, it cut me off. Well, forget the code post. Function tries to do a $this→escape($string) and I get an error “Fatal error: Call to undefined method User::escape()” I can substitute $this→escape with self::escape and it works. But then it just breaks in a similar manner when $this is called later on.

I would recommend you use a preg_replace function instead of STRTR. The reason is that STRTR ignores any word boundaries you might want. Take an example of a query like: “INSERT INTO comments ( … ) VALUES ( comment_author = :comment_author, comment_author_email = :comment_author_email … )”

You should see the problem there. STRTR will replace all instances of ‘:comment_author’. That means the resulting sql could look something like: “INSERT INTO comments ( … ) VALUES ( comment_author = ‘colyn’, comment_author_email = ’colyn’_email … )”

Instead, you should use PREG_REPLACE, with a regular expression that looks for word boundaries:

foreach ( $formattedParams as $key => $val )
{
$sql = preg_replace(“/\b$key\b/”, $val, $sql);
}

Copyright © 2002-2012 Ryan Grove. All rights reserved.
Powered by Thoth.