Tuesday, February 17, 2009

Singletons, BIRT, Threads and PHP

PHP is designed to run in multithreaded web servers (such as Apache and IIS). What this means in practice is that a single PHP process concurrently processes multiple browser requests. It does this by having multiple threads, each of which handles one HTTP request at a time.

PHP can also be built for non-multithreaded web servers. In this deployment model life is somewhat different because the web server spawns multiple PHP processes. When requests arrive at the web server, it farms them out among its pool of PHP worker processes. Each worker process handles just one request at a time.

The pool of worker processes can grow and shrink over time to allow for busy periods. Zero is multithreaded so it handles multiple web requests concurrently in the same process.

I’ve been playing with the Eclipse Business Intelligence and Reporting Tools project (BIRT). BIRT is an open source Eclipse project that provides a whole bunch of stuff around designing and deploying reports and charts. It has a Java API and so I thought it would make a nice demonstration of the PHP/Java Bridge in Zero.

The problem is that BIRT needs to be started once, and only once, in any given process. Once it has been started up, then it is plain sailing to load reports and render them to HTML. The BIRT runtime should be kept around until the process shuts down. 


This sounds suspiciously like the singleton pattern (or anti-pattern if you prefer). The trouble is that process-wide singletons are hard to implement correctly in PHP. You might think the following snippet would work:

class Singleton {
    private static $birt = NULL;
    static function getInstance() {
        if (self::$birt == NULL) {
            self::$birt = new BIRT(...);
        }
        return self::$birt;
    }
}

There are two problems with this. First, at the end of the current request, all statics are cleared away so $birt is reset to NULL. What we want is the BIRT runtime to be kept around forever (or at least until the process shuts down). 

The second problem is that on a multithreaded PHP runtime each PHP request gets an independent set of static and global variables. If two requests are being processed, and each calls getInstance, then they will both create an instance of the BIRT class. 


The design intent in PHP is that each request looks like it is running in a single threaded PHP process, even though it is actually running in a multithreaded PHP process. But what I want is one (yes really one!) object in a process which can be shared amongst all the threads. Sharing a single object like this is safe because the BIRT runtime is thread safe, meaning that it can be used by multiple threads at the same time.

The solution for me turned out to be the Zero Global Context. The Global Context is a storage area in Zero where just about anything can be put. Different bits of the Global Context have different lifetimes. For example, the /request zone is only kept around for the current request. I opted to put the BIRT runtime in the /tmp zone so that it is kept around for the lifetime of the process. It is also available to all threads running in the same process.

The final piece of the jigsaw was to ensure only one thread actually created the BIRT runtime. Dealing with multiple threads can be something of a headache as race conditions are easily created.

The Global Context provides a solution (albeit slightly crufty) with locking commands. Each thread does a lock, checks to see if the BIRT runtime has already been created (if not then it creates it and puts it in the /tmp zone) and then unlocks.
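
As a rough sketch of that lock, check, create, unlock sequence, the logic looks something like the code below. This is not the Zero API: FakeGlobalContext, its lock/unlock/get/put methods, the /tmp/birtRuntime key and getSharedRuntime are all made-up names for illustration, and the lock is only a flag because a plain PHP script has a single thread.

```php
<?php
// Hypothetical stand-in for a process-wide store with locking.
// None of these names come from Zero; they only mimic the shape of the idea.
final class FakeGlobalContext
{
    private $zones = [];
    private $locked = false;

    public function lock(): void
    {
        // A real implementation would block here until the lock is free.
        $this->locked = true;
    }

    public function unlock(): void
    {
        $this->locked = false;
    }

    public function isLocked(): bool
    {
        return $this->locked;
    }

    public function get(string $key)
    {
        return $this->zones[$key] ?? null;
    }

    public function put(string $key, $value): void
    {
        $this->zones[$key] = $value;
    }
}

// Lock, check, create if missing, unlock.
function getSharedRuntime(FakeGlobalContext $gc, callable $factory)
{
    $gc->lock();
    try {
        $runtime = $gc->get('/tmp/birtRuntime'); // assumed key name
        if ($runtime === null) {
            $runtime = $factory();               // the expensive one-off start-up
            $gc->put('/tmp/birtRuntime', $runtime);
        }
        return $runtime;
    } finally {
        $gc->unlock();                           // runs even if the factory throws
    }
}

$gc = new FakeGlobalContext();
$created = 0;
$factory = function () use (&$created) {
    $created++;
    return new stdClass(); // placeholder for the real BIRT runtime
};

$first = getSharedRuntime($gc, $factory);
$second = getSharedRuntime($gc, $factory);
```

Calling getSharedRuntime twice runs the factory once and hands back the same object both times. The try/finally is there so the lock is released even if creating the runtime throws.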

With a bit of luck I'll get this written up in a developerWorks article soon as the reports look really sweet (cross tabs, charts, compound reports and so on).

6 comments:

  1. Why not just instantiate it at the very start of your application and inject it?

    That way you can make sure it's only started once; and you can write unit test coverage to prove it - injecting a mock / fake.

    Moar: http://misko.hevery.com/2008/08/17/singletons-are-pathological-liars/

  2. There are a couple of problems. The first is where to store the BIRT runtime once it has been created. Globals, statics and more generally singletons in PHP are not process-wide.

    The other problem is how to inject it into the web server process. This would require a run-once PHP script that gets invoked (by HTTP?) after the web server starts, but before it starts processing real web requests.

  3. Great post, looking forward to the developerWorks article! You refer to the technique used by BIRT as a possible "anti-pattern". How would you prefer such an API to be structured? Is it possible for APIs that need one-off initialization to hide this step from consumers - or at least to be easier to work with? How so?

  4. Yes, that's a really good point. There are many Java libraries that have some "do-once" code. It's easy to call that code in rich client applications that have an obvious [start, do-event-loop, end] structure. In web applications, finding somewhere to call the start/stop code isn't so obvious.

    In Zero we used to have an onStart/onStop event that got called when the application started up and shut down. The trouble with this design is that we started putting loads of stuff in there, even if it wasn't actually used.

    So for example, I might start up the BIRT runtime in the onStart event but the PHP process may never render any reports so that was wasted effort! A better approach is to lazily start BIRT when the first report gets rendered.

    The article will show how to do this with some Zero magic called the gcMiss event. gcMiss is explained in the Zero documentation but it doesn't have any examples so at the moment it is totally unobvious how it helps solve this problem!

  5. We have done some work with BIRT and the PHP/Java Bridge Open Source project that may suit your needs. Take a look at this article.
    http://www.theserverside.com/tt/articles/article.tss?l=IntegratingBIRTwithPHP

    Jason

  6. Great article. One small error, though: the Global Context URL in the article doesn't work.
