December 2011 Archives

I coded a simple JSON web service which converts any HTML input to PDF (using the excellent wkhtmltopdf software). You can supply either the HTML code itself or a URI to fetch it from.

Using the powerful Mojolicious web framework (Mojolicious::Lite is actually enough for this application), the whole thing takes roughly 100 lines of code:

#!/usr/bin/perl

# Pulls in strict and the 5.14 feature bundle (including
# unicode_strings); nothing else here needs perl 5.14, so on
# older perls a plain "use strict;" is enough
use v5.14;

use Mojolicious::Lite;
use Path::Class;
use File::Temp;
use Mojo::UserAgent;
use MIME::Base64;

my $config = {
    wkh     => '/usr/local/bin/wkhtmltopdf',
    tmpdir  => '/tmp',
    auth    => 'maitai',
};

post '/' => sub {
    my $self = shift;
    
    my $args = $self->req->json;

    # Handle obvious error cases (the body must be a JSON object)
    return $self->mkerror('invalid-JSON-content')
        if ref $args ne 'HASH';
    return $self->mkerror('invalid-auth-information')
        if ( $args->{auth} // '' ) ne $config->{auth};

    # Clients can pass us HTML content or a URI where to fetch it
    if ( !defined $args->{html} ) {
        return $self->mkerror('no-html-nor-uri')
            if !$args->{uri};
        
        # Fetch the page
        my $ua = Mojo::UserAgent->new();
        my $tx = $ua->get($args->{uri});
        my $res = $tx->success;
        if (!$res) {
            my ($msg, $code) = $tx->error;
            return $self->mkerror("fetch-page-error: $msg");
        }
        $args->{html} = $res->body;
    }

    my $html_file = $self->make_html_file( $args->{html} );

    my $pdf_fn = $html_file->filename;
    $pdf_fn =~ s/\.html\z/.pdf/xms;

    # Build the command line
    my $hcmd = $self->build_wkh_command($args);
    $hcmd .= ' ' . $html_file->filename . " $pdf_fn";

    # Create the PDF file
    my $output = `$hcmd`; # TODO: error handling

    # Read the output and return it
    my $pdf_file = Path::Class::File->new($pdf_fn);
    my $pdf = $pdf_file->slurp();

    # Unlink the PDF file
    $pdf_file->remove();

    return $self->render_json({
        status  => 'ok',
        pdf     => encode_base64($pdf),
    });
};

helper build_wkh_command => sub {
    my ($self, $args) = @_;

    # Default page size is A4, but labels need a smaller one, so it stays configurable
    my $page_size = '--page-size ' . ($args->{page_size} || 'a4');

    # Custom page size will override the previous
    if ( defined $args->{page_width} && defined $args->{page_height} ) {
        $page_size = "--page-width $args->{page_width}"
            . " --page-height $args->{page_height} ";
    }

    # Build the wkhtmltopdf command line
    my $hcmd = $config->{wkh} ." --encoding \"utf-8\" $page_size ";
    $hcmd .= "--margin-top $args->{top_margin}mm "
        if defined $args->{top_margin};
    $hcmd .= "--margin-left $args->{left_margin}mm "
        if defined $args->{left_margin};
    $hcmd .= "--margin-bottom $args->{bottom_margin}mm "
        if defined $args->{bottom_margin};
    $hcmd .= "--margin-right $args->{right_margin}mm "
        if defined $args->{right_margin};
    $hcmd .= "--orientation $args->{orientation} "
        if defined $args->{orientation};

    return $hcmd;
};

helper make_html_file => sub {
    my ($self, $html) = @_;

    my $htmlf = File::Temp->new(
        DIR     => $config->{tmpdir},
        SUFFIX  => '.html',
        UNLINK  => 1,
    );
    binmode $htmlf, ':encoding(UTF-8)';
    print $htmlf $html;
    # Flush buffered output so wkhtmltopdf sees the complete file
    $htmlf->flush;

    return $htmlf;
};

helper mkerror => sub {
    my ($self, $error) = @_;
    
    return $self->render_json({
        status  => 'error',
        error   => $error,
    });
};

app->start;

Once you have this set up (via CGI, FastCGI, morbo, starman or whatever you like best), you just need to POST your data as JSON, for example:

{
    "auth"    : "maitai",
    "html"    : "<html><head><meta charset=UTF-8></head><body>Ciao!</body></html>"
}

or:

{
    "auth"    : "maitai",
    "uri"     : "http://www.skm.to/"
}

and you get a JSON response such as this:

{
    "status"  : "ok",
    "pdf"     : "pdf_data_base64_encoded"
}

The PDF data is base64 encoded in order to be safely transferred without risking corruption because of character set encoding/decoding.
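
A client for this exchange could look roughly like the sketch below. It is only an illustration, not part of the service: the function names (build_request, extract_pdf, fetch_pdf) are made up for this example, and it uses core modules only (HTTP::Tiny, JSON::PP, MIME::Base64), so the client side does not need Mojolicious.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(encode_json decode_json);
use MIME::Base64 qw(decode_base64);

# Build the JSON request body. Exactly one of html/uri is expected,
# mirroring the two request examples above.
sub build_request {
    my ($auth, %source) = @_;
    die "pass either html or uri\n"
        unless defined $source{html} xor defined $source{uri};
    return encode_json({ auth => $auth, %source });
}

# Decode the service's JSON reply and return the raw PDF bytes,
# or die with the error string reported by the service.
sub extract_pdf {
    my ($json) = @_;
    my $reply = decode_json($json);
    die 'service error: ' . ( $reply->{error} // 'unknown' ) . "\n"
        if ( $reply->{status} // '' ) ne 'ok';
    return decode_base64( $reply->{pdf} );
}

# POST the request to the service and return the PDF bytes.
# $service_url is wherever you deployed the script (an assumption
# of this sketch; the post does not name a URL).
sub fetch_pdf {
    my ($service_url, $auth, %source) = @_;
    my $res = HTTP::Tiny->new->post( $service_url, {
        headers => { 'Content-Type' => 'application/json' },
        content => build_request( $auth, %source ),
    } );
    die "HTTP error: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return extract_pdf( $res->{content} );
}
```

Assuming the service listens on http://localhost:3000/ (a guess for a local test setup), fetch_pdf('http://localhost:3000/', 'maitai', uri => 'http://www.skm.to/') would return the PDF bytes, which should be written to a file opened in binary mode (binmode).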

Some notes:

  • There are some configuration options (page size, margins, orientation): take a look at the source code.
  • Not all wkhtmltopdf features are implemented, but the software is very easy to extend.
  • The authentication system is just an example; a real one should be far more robust.
  • Arguments should really be checked for safety, because they are interpolated into a shell command; otherwise security issues could arise (although an attacker would first have to get past the authentication).
  • Error handling should be improved.
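
As one possible way to address the argument-safety note, the sketch below validates each option against a whitelist pattern and returns the command as a list of arguments rather than a single string. The name build_wkh_argv and the specific patterns are assumptions made for this illustration (the original script accepts any value); only the wkhtmltopdf flags are taken from the code above.

```perl
use strict;
use warnings;

# Hypothetical validation patterns: plain numbers for margins,
# numbers with an optional unit for custom page dimensions.
my $NUMBER = qr/\A\d{1,4}(?:\.\d{1,2})?\z/;
my $LENGTH = qr/\A\d{1,4}(?:\.\d{1,2})?(?:mm|cm|in)?\z/;

my %VALID = (
    page_size     => qr/\A[A-Za-z][A-Za-z0-9]{0,11}\z/,
    page_width    => $LENGTH,
    page_height   => $LENGTH,
    top_margin    => $NUMBER,
    left_margin   => $NUMBER,
    bottom_margin => $NUMBER,
    right_margin  => $NUMBER,
    orientation   => qr/\A(?:portrait|landscape)\z/i,
);

# Return the wkhtmltopdf command as a list of arguments, dying on
# any option value that does not match its pattern.
sub build_wkh_argv {
    my ($wkh, $args) = @_;

    for my $key ( sort keys %VALID ) {
        next if !defined $args->{$key};
        die "invalid-$key\n" if $args->{$key} !~ $VALID{$key};
    }

    my @argv = ( $wkh, '--encoding', 'utf-8' );

    # A custom width/height pair overrides the named page size
    if ( defined $args->{page_width} && defined $args->{page_height} ) {
        push @argv, '--page-width',  $args->{page_width},
                    '--page-height', $args->{page_height};
    }
    else {
        push @argv, '--page-size', $args->{page_size} || 'a4';
    }

    for my $side (qw(top left bottom right)) {
        my $value = $args->{"${side}_margin"};
        push @argv, "--margin-$side", "${value}mm" if defined $value;
    }

    push @argv, '--orientation', $args->{orientation}
        if defined $args->{orientation};

    return @argv;
}
```

The returned list can then be run with the list form of system, for example system(@argv, $html_path, $pdf_path), which does not pass the arguments through a shell; checking its return value would also be a first step towards the missing error handling.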

Why did I do this? Basically, I have some web apps hosted on a managed FreeBSD server, where compiling wkhtmltopdf does not work very well (and it has a lot of prerequisites anyway). This way I can easily "outsource" PDF generation.

Udine Programmers Is Born

Udine Programmers has been launched: an initiative that aims to become a meeting point for programmers (and "related" roles such as system administrators, web designers, ...) in Udine and Friuli.

This is the current short manifesto:

Do you study or work on something related to programming?
Do you want to meet other programmers?
Do you have a project and need collaborators?
Are you looking for a job as a programmer?
Do you need programming advice?

Following the example of Programmers in Padua, the idea is to create a meeting point in Friuli for people involved in programming and related activities.

Among the goals of the initiative: letting people who do similar work get to know each other (and so also exchange job opportunities), organizing technical meetings, and having dinner together now and then.

Professionals, enthusiasts, and students are welcome, as is anyone else!

We look forward to seeing you!

http://www.udineprogrammers.it/
