Write no more SQL, abstract it!

I don’t even know how many SQL statements I wrote in my programming life so far. Embedding the SQL into your code, whatever language you use, makes it less readable, uglier and - especially if the complex queries - particularly inelegant. Even when coding in a MVC fashion (which is highly recommended in some cases, like web applications) you end up to just moving the ugly code to the model part of your program, while removing it entirely would probably be the best choice. The problem is that you actually need to query the database somehow, so you can’t remove the SQL code; however, you can put something else in place of SQL statements. Let’s see what, from the point of view of a Perl programmer of course. ;-)

There are modules which allow SQL abstraction by generating SQL code from data structures. They let you use an arbitrary complex structure to build an arbitrary complex query without inserting any database-related code into your Perl program. Note that this is different from what ORMs - Object Relational Mappers - do: the latter abstract the SQL database into objects, while what I’m talking about abstracts the queries. However, a good ORM such as DBIx::Class provides query abstraction as well, usually by relying on one of the modules available on CPAN. You’ll probably want to use an ORM, but that’s not what I’m discussing here.

In this short article I’ll cover SQL::Abstract, which is also the module DBIx::Class uses. It is fairly easy to get used to it to write simple queries, and with a bit of thought also the complex queries can factored into Perl data structures. For instance if you have:

my $query = "SELECT users.name, users.phone, users.email
    FROM users
    WHERE users.id = 1 OR users.id = 2
    ORDER BY id
";
my $sth = $dbh->prepare($query) or die $dbh->errstr;
$sth->execute() or die $dbh->errstr;

You can write this instead with SQL::Abstract:

my $sql = SQL::Abstract->new();

my ($query, @bind) = $sql->select(
    'users',
    [ 'name', 'phone', 'email' ],
    { id => [ 1, 2 ] },
    [ 'id' ],
);
my $sth = $dbh->prepare($query) or die $dbh->errstr;
$sth->execute(@bind) or die $dbh->errstr;

You might protest that the code is longer, and it actually is a bit (even though code formatting contributes to that). It’s however much clearer: there’s no extra-Perl code around field names and values, and you can easily understand what the query does at first sight. The select method accepts four parameters: the name of the table on which you want to work, the fields you want to retrieve, the where clause, and the optional ordering of the results. You get, as return values, a string containing the query, and the array of the bound values which you need to use when calling execute().

The where clause is the place where the real magic happens. You can use a Perl data structure to specify the conditions you need:

my $conds =    {
    id => [ '1', '2' ]
}

You can go on with arbitrary complex conditions:

my $conds = {
    -or => [
        id => [ '1', '2' ]
        -and => [
            name => 'Fatina buona',
            email => { '!=', 'joker@example.com' },
            phone => { 'like' => '%3456%' },
       ],
    ]
};

Here the “first level” conditions are contained inside a -or arrayref, which makes SQL::Abstract associate them one to the other with OR instead of the default AND operator. The id has to be 1 or 2 as in the previous examples; if it isn’t, all the three conditions inside the -and arrayref must be satisfied. As you can see, the second and the third conditions here are specified using an hashref instead of a single value: this allows to specify the operator to use when comparing (default is “=”). At this point you can probably work out yourself the SQL that this data structure generates.

You can also perform inner joins easily:

my ($query, @bind) = $sql->select(
    [ 'users', 'payments' ],
    [ 'user.name', 'payment.dsc' ],
    { 'users.id' => [ 1, 2 ] },
    [ 'users.id' ],
);

Using a similar syntax you can also perform operations other than SELECT. For instance, to insert a row in a table you can do:

my ($query, @bind) = $sql->insert(
    'users',
   [ 'Michele', 'my@email.com', '123 546 874' ],
);
$dbh->do($query, undef, @bind);

or - in a more clear and scalable way - you can say:

my ($query, @bind) = $sql->insert(
    'users',
    {
        name => 'Michele',
        email => 'my@email.com',
        phone => '123 546 874',
    },
);
$dbh->do($query, undef, @bind);

This syntax guarantees that, when reading your code, you understand at the very first sight exactly what operation it performs on the database. UPDATE statements are similar to INSERT ones, except for the fact that you might want to provide a where condition:

my ($query, @bind) = $sql->update(
    'users',
    {
        name => 'Michele',
        email => 'my@email.com',
        phone => '123 546 874',    # Default phone number
    },
    {
        id => 4,     # Update only the user where ID is 4
    }
);
$dbh->do($query, undef, @bind);

As you can see, this kind of code is also easier to comment. Finally, you can delete a record:

my ($query, @bind) = $sql->delete(
    'users',
    {
        name => { 'like' => 'M%' },    # Delete everything which name begins with 'M'
    },
);
$dbh->do($query, undef, @bind);

Easy, uh? You can also get the where clause and then use it in the statements you create without SQL::Abstract. This is not normally useful (why would you want to write part of a query by hand?) but it allows you to perform multi-table selects with joins different from inner ones (SQL::Abstract curretly supports only the inner joins as shown earlier) while continuing to use this module to build the where clause (which is recommended, as the where clause is the most complex and difficult-to-read part of the query). For example you can say:

my ($where, @bind) = $sql->where(
{
    'users.id'      => { '!=' => 1 },
    'users.birth'   => {
        -between => [ '1977-01-01', '1977-11.15' ],
    },
    'locations.name' => {
        'like' => [ 'A%', 'B%' ],   # Begins with either 'A' or 'B'
    },
},
[
    'users.birth', 'locations.name',
]
);
my $query = "
    SELECT users.id, users.birth, locations.name
    FROM users LEFT JOIN locations ON locations.id = users.idlocation
    $where
";
my $sth = $dbh->prepare($query) or die $dbh->errstr;
$sth->execute(@bind) or die $dbh->errstr;

If you need the LIMIT clause, the you can use the SQL::Abstract::Limit module:

my $sql = SQL::Abstract::Limit->new( limit_dialect => $dbh );

my ($query, @bind) = $sql->select(
    'users',
    [ 'name', 'phone', 'email' ],
    { id => [ 1, 2 ] },
    [ 'id' ],
    5, 10
);
my $sth = $dbh->prepare($query) or die $dbh->errstr;
$sth->execute(@bind) or die $dbh->errstr;

Where clause data structures can be as much complex as you like: even though subselects are not currently supported, you’ll be able to use all the functions and stored procedures your database server offers. Take a look at the documentation of the module: you’ll find more example and some details about its usage; this article does not want to document the module, it’s a showcase of its features and it wants to convince you to try SQL::Abstract.

Using DBIx::Class you can get the most out of SQL::Abstract, as you gain the ability to easy performs joins with several kinds of relations, group_bys, prefetches and (soon) subselectes. All of this without wiriting SQL. I’ll write something about this next time.

The Cattle Grid

Splashes of digital ink by Michele Beltrame

Write no more SQL, abstract it!