Wednesday, February 25, 2015

Data Loader for Celebrus

It is just a few weeks since we completed the acquisition of Celebrus by IS Solutions. Since then we’ve moved offices from Newbury to Sunbury-on-Thames. Not surprisingly that involved clearing out a whole heap of stuff which has been lying around for far too long. Along the way we unearthed several dusty product release CDs going back over a decade.

Looking back at those earlier releases, it’s interesting to contrast the focus back then with our latest release (v8 update 11). Ten years ago, data collection was focused primarily around reporting. Lots of totals, averages and aggregations of one kind or another. And the technology matched those requirements. In Celebrus terms, this was, and still is, implemented by our Analytics Server, part of our v8 Big Data Engine. Every so often the Analytics Server fires up and calculates summary information from activity in the last five minutes, hour, day etc. The results of that processing are written to a set of database tables. There’s nothing inherently wrong with the Analytics Server approach; it is simply that the world has moved on.

The focus today is almost exclusively on highly detailed data about individuals, not just summary information. The data also needs to be available in near real-time. This information is crucial to understand each and every journey a customer has had with your brand. Armed with this insight into customer behaviour, a whole slew of possibilities unfold which enable you to understand and optimise your business, whether that be to offer a discount to a valuable customer, or to understand why someone chose a competitor’s product. All these use cases and many more start with data.

As Sherlock Holmes once said:

It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.

So with all this in mind, you will see in our latest release our new Data Loader. The Data Loader is our go-forward architecture for loading data at lightning-fast speeds. The Data Loader scales to support huge amounts of traffic on some of the busiest web sites in the world, and can process tens of thousands of events per second (sustained). Not just that, it delivers the data into your systems in less than a minute, making new use cases around streaming analytics possible. We support MySQL, Microsoft SQL Server, Oracle and Teradata out of the box. Better yet, the Data Loader includes a pre-defined database schema covering some 75+ tables and models: everything you might want to understand about your digital customers.

In addition we’ve been working hard with the folks over at MongoDB. This release has been fully certified with MongoDB Enterprise Edition. This makes MongoDB the perfect data store for Celebrus customer journey data. This customer journey data is focused towards operational applications. For example, contact centre staff use this information to help them understand a customer’s interactions with your brand.

This is the first release where we have worked with a document database, and it has been a really good experience. The flexibility, simplicity and productivity of MongoDB are tremendous. For example, in MongoDB we simply store all business events in a single collection (rather than lots of normalised relational tables). Each type of business event contains some common attributes (timestamp, session number, customer identifier, event type and so on). Each event type also contains some more specific information, for example the purchase price, quantity and SKU code for a purchase transaction event. All of this just works with MongoDB: no friction, no joins, no complexity. Job done!
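To make the single-collection idea concrete, here is a minimal sketch in Python. It uses plain dictionaries in a list to stand in for documents in a collection, so it runs without a database. The field names (`session_number`, `customer_id`, `sku` and so on) and the sample values are assumptions for illustration only, not the actual Celebrus schema.

```python
from datetime import datetime, timezone

# Hypothetical event documents. Field names are illustrative, not the Celebrus schema.
# Both documents share the common attributes; each adds its own type-specific ones.
events = [
    {
        "timestamp": datetime(2015, 2, 25, 10, 15, 0, tzinfo=timezone.utc),
        "session_number": 1001,
        "customer_id": "C-42",
        "event_type": "page_view",
        "page": "/garden/lawnmowers",
    },
    {
        "timestamp": datetime(2015, 2, 25, 10, 17, 30, tzinfo=timezone.utc),
        "session_number": 1001,
        "customer_id": "C-42",
        "event_type": "purchase",
        "price": 199.99,
        "quantity": 1,
        "sku": "LM-2000",
    },
]

# Attributes every event carries, whatever its type.
COMMON_FIELDS = {"timestamp", "session_number", "customer_id", "event_type"}


def specific_attributes(event):
    """Return only the attributes that are particular to this event type."""
    return {k: v for k, v in event.items() if k not in COMMON_FIELDS}


def events_of_type(collection, event_type):
    """Select events of one type; no joins are needed because each event is self-contained."""
    return [e for e in collection if e["event_type"] == event_type]


purchases = events_of_type(events, "purchase")
print(specific_attributes(purchases[0]))
```

Against a real MongoDB collection, the list comprehension in `events_of_type` would typically become a query on the `event_type` field through the driver of your choice.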

Interested in reading more? These slides walk through the Data Loader.

Tuesday, September 30, 2014

So What's The Context?

Taking words out of context is one of those distortions which never fails to amuse. Whether it be entertainment, politics, or history, everything seems vulnerable to removing a few choice words. And context isn’t reserved just for the world of words; photography is fair game as well, as this example illustrates:

So why is context important for events and data collection? The answer is that an event, perhaps a customer visits a certain page on your web site, is rarely clear in a vacuum. The information surrounding the event provides a perspective through which the event makes sense and can be accurately understood.

In this example, perhaps the customer is visiting a product page for lawnmowers and has done so a few times in the last week. The result of understanding the context is that we can make better decisions. A marketer might like to offer the customer a discount to turn their repeated browsing into a purchase.

Every event which the Celebrus platform collects and processes has a context added to it. What does this mean in practice? Well, actually it’s really simple. First we identify the core attributes for an event. For a completed transaction that might be the currency, the total purchase value and the set of items in the basket. Then we supplement this with a much larger set of context attributes which complete the picture.

For example, a few of the context attributes include the customer email address, where in the world they are browsing from, and what kind of device they are browsing on. This combination of the key (primary) attributes and the context attributes defines the event. And every event can be fed in real time into your data warehouse.
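As a rough illustration of how core and context attributes combine into one event, here is a small Python sketch. The function name `build_event` and all attribute names and values are hypothetical; this shows the idea rather than how the Celebrus platform implements it.

```python
def build_event(event_type, core, context):
    """Combine core (primary) attributes with context attributes into one event.

    Raises ValueError if the same attribute name appears in both sets, so that
    context can never silently overwrite a core value.
    """
    overlap = core.keys() & context.keys()
    if overlap:
        raise ValueError(f"attributes defined twice: {sorted(overlap)}")
    return {"event_type": event_type, **core, **context}


# Core attributes for a completed transaction (illustrative values).
core = {
    "currency": "GBP",
    "total_value": 249.98,
    "items": ["LM-2000", "GLOVES-01"],
}

# A few context attributes that complete the picture (illustrative values).
context = {
    "customer_email": "jo@example.com",
    "country": "GB",
    "device_type": "tablet",
}

event = build_event("transaction", core, context)
print(sorted(event))
```

Rejecting overlapping names is a design choice made for this sketch; a real system might instead namespace the two sets of attributes.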

This understanding of context becomes even more critical when we talk about path analysis. Path analysis allows us to chart a voyage over time for some set of attributes. For example, a simple path analysis allows us to view the set of pages a customer visited before they left to join a competitor. Understanding these pages might be very useful for identifying potential churn. This ability to gather really meaningful business information is why feeding data into our partner products like Teradata Aster is one of the biggest growth areas in our product.
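A simple path analysis of this kind can be sketched in a few lines of Python. This is a toy version under stated assumptions: events are dictionaries with hypothetical `customer_id`, `timestamp`, `event_type` and `page` fields, and an `account_closed` event stands in for the moment a customer leaves. Products like Teradata Aster do this at far greater scale and with far richer pattern matching.

```python
from collections import defaultdict


def paths_before(events, terminal_type, max_pages=5):
    """For each customer, list the pages viewed (in time order) before their
    first event of type `terminal_type`. Customers with no such event are omitted."""
    by_customer = defaultdict(list)
    for e in sorted(events, key=lambda e: e["timestamp"]):
        by_customer[e["customer_id"]].append(e)

    paths = {}
    for customer, customer_events in by_customer.items():
        pages = []
        for e in customer_events:
            if e["event_type"] == terminal_type:
                # Keep only the last few pages leading up to the terminal event.
                paths[customer] = pages[-max_pages:]
                break
            if e["event_type"] == "page_view":
                pages.append(e["page"])
    return paths


# Illustrative events; timestamps are simple integers to keep the example short.
sample_events = [
    {"customer_id": "A", "timestamp": 3, "event_type": "page_view", "page": "/cancel-account"},
    {"customer_id": "A", "timestamp": 1, "event_type": "page_view", "page": "/home"},
    {"customer_id": "B", "timestamp": 1, "event_type": "page_view", "page": "/home"},
    {"customer_id": "A", "timestamp": 2, "event_type": "page_view", "page": "/pricing"},
    {"customer_id": "A", "timestamp": 4, "event_type": "account_closed"},
]

churn_paths = paths_before(sample_events, "account_closed")
print(churn_paths)
```

Customer B never closes their account, so only customer A appears in the result.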

Interested in more blogs from the Celebrus team? Click here to get to our data hub.

Friday, July 25, 2014

Top Tips for Looking at Data Analytics – the Five W’s

Celebrus is all about data. Specifically data which tells you what your customers are doing and via which channels. If I were a marketer, instead of a software developer, I’d probably call this ‘audience behavioural intelligence’, or some such fancy term. When we talk about data we really mean events. Events are the lifeblood of Celebrus. Our software captures, crunches, transforms and processes billions of them every day. And it’s these events which give you the insight into what your customers are doing.  

So what does this all mean in practice?

The answer we will explore in this blog post is guided by one of the world’s greatest writers of children’s stories, namely Rudyard Kipling.

Let’s get started with an extract from The Elephant’s Child:

I keep six honest serving-men:
    (They taught me all I knew)
Their names are What and Where and When
    And How and Why and Who.

When considering events, the what, where, when and who are exactly the questions we need to ask. And by using data analytics to develop deep insight from the data you can answer not just those questions, but also the why and how.

Every time an individual does something on your web site, an event captures essential information about their activity. For example, a customer might type some words into a text box and click ‘submit’. On the face of it that’s not very interesting, but stay with me, it gets better.

Celebrus can tell you whether an individual visitor arrived via a search engine or typed the URL directly into the browser, as well as understanding the referring campaign information. It also lets you know that during their visit they viewed several products in the gardening department and put one into their basket but sadly left the site before completing a checkout. You’ll also learn that the same visitor returned a few days later following an abandoned basket email they received and did indeed complete their transaction this time. That customer is now in the top quartile of most valuable individuals to your web site, and has been for over a year.

A simple example, but hopefully it illustrates the point. What Celebrus does is to take that torrent of events coming from your web applications (and elsewhere, such as mobile applications and social media interactions), and convert them into a stream of contextualised business events. The business events identify who the customer is, what they did (transaction, wish list add, search, basket add and much more), where in the application it happened, and when it occurred. The business events are fed into your data warehouse in real-time, where they are available to really understand your online business.

It’s this ability to capture, collect and transform business events which sets Celebrus apart.

Interested in more blogs from the Celebrus team? Click here to get to our data hub.

Tuesday, December 17, 2013

Celebrus v8

Six months have flown past since I joined Celebrus, and with no further ado I think it’s time to post some information on our latest product release, namely Celebrus v8. This release is hugely important to our clients, rolling together an incredible amount of work over the last couple of years. Given that we’re only a small company, we punch way above our weight. Not heard of us? Well, we provide customer analytics on 20K+ web sites worldwide. We sell almost entirely through our channel partners (OnMarc, SAS and Teradata to name but three).

A very brief summary of what’s new in our latest release:

Real-Time Event Stream

The most important feature in this release. We have for many years provided an analytics facility to crunch data in Celebrus and then push the summarized information out into data warehouses for reporting (Oracle, MySQL, Teradata et al). With this release we can push the detailed event data out in real-time (CSV is often the data format of choice).

This approach is perfect for the new range of data crunching applications like Teradata Aster and Hadoop. Ultra low latency is the key to making this fly. Teradata Aster enables a wide range of data discovery capabilities including path and graph analysis.

Semantic Configuration

One of the unique features of Celebrus is that we turn low-level events (clicks, form submits, field interactions et al) into meaningful business events like checkout, basket add, product views, marketing campaign attribution, customer behaviours and much more.

Furthermore, this isn't just for web site users, but is across all channels.

We do this without any tedious, costly and error-prone web page tagging. How do we do this? The solution is to configure the Celebrus system with the meaning of certain events, our so-called semantic configuration. You simply configure the system with how to interpret user actions and we turn those low-level form submits into a set of meaningful search terms and so on. Easy to do and quick to change as your customer channels evolve.
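To give a feel for the idea, here is a hypothetical Python sketch of rule-driven interpretation. The rule format, the `interpret` function and every identifier in it (`site-search`, `q`, `add-to-basket`, `data-sku`) are invented for illustration; the real Celebrus semantic configuration is not shown in this post and will differ.

```python
# Hypothetical rules: each says which raw events it matches, which business
# event they represent, and which raw fields map to which business attributes.
RULES = [
    {
        "match": {"raw_type": "form_submit", "form_id": "site-search"},
        "business_event": "search",
        "extract": {"q": "search_term"},
    },
    {
        "match": {"raw_type": "click", "element_id": "add-to-basket"},
        "business_event": "basket_add",
        "extract": {"data-sku": "sku"},
    },
]


def interpret(raw_event, rules=RULES):
    """Turn a low-level event into a business event using the first matching rule.

    Returns None when no rule applies, i.e. the raw event has no configured meaning.
    """
    for rule in rules:
        if all(raw_event.get(key) == value for key, value in rule["match"].items()):
            fields = raw_event.get("fields", {})
            attributes = {
                target: fields[source]
                for source, target in rule["extract"].items()
                if source in fields
            }
            return {"business_event": rule["business_event"], **attributes}
    return None


raw = {"raw_type": "form_submit", "form_id": "site-search", "fields": {"q": "lawnmower"}}
print(interpret(raw))
```

Because the meaning lives in configuration rather than in page tags, supporting a new form or button in this sketch means adding a rule, not changing the web site.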

Real-Time Decision Management

There is some very smart event processing software out there. A good example is Teradata RTIM. RTIM enables you to push personalised messages to your customers in real time through Celebrus. Celebrus provides the customer identification (who) and the reliable delivery platform (where/how) and RTIM does the really interesting part of deciding the content (what/when). Blend into that an understanding of what has worked in the past and RTIM can use its integrated predictive models to fine tune the offers.


With this release we’ve pushed the envelope way out. Even on some fairly run-of-the-mill multi-core hardware (by today’s standards) we can process 25K+ events per second, sustained. The performance graph below is from one of our test machines running in a Jenkins cluster. Wondering about the uptick halfway through? The result of some tuning around the Java CMS Garbage Collector, but that's another story.

Integrated Security and Management

A single place to manage users and roles across the system. Many of our clients deploy Celebrus to cope with huge event rates (10K+ events per second sustained is common) and so not surprisingly there is a need to scale for performance and reliability reasons. With our latest release, a Celebrus cluster can be managed through a single smart client interface.

I’ve posted some slides here which explain a little more about our company and technology.

Sunday, April 7, 2013

Healthcare Analytics with Message Broker

One feature I would like to call out in the new version of the Healthcare Connectivity Pack is our healthcare analytics capability. This feature makes it incredibly easy to find, search and extract meaningful data from a range of healthcare data sources (HL7 v2, DICOM and CDA). The tooling can do this because it understands these healthcare formats, not just at a syntactic level but the semantics as well. This knowledge of a standard is encapsulated in what we call a Data Analysis Profile (DAP).

For example, in the case of clinical documents, the CDA DAP understands the HL7 v3 RIM and structural attributes (such as templateID and root). With this understanding the tooling can pick apart clinical documents and show you the meaning of different sections (medications, procedures etc). This is just the tip of the iceberg; the tooling makes it a breeze to do really complex work with clinical documents.

Another example is HL7 v2 observation results: the HL7 DAP understands where the coding system is stored in the OBX segment. Armed with this information the tooling deciphers LOINC codes to present user-friendly display names when you are working with your data. In fact you can plug in your own bespoke coding systems as well (a common requirement in providers).
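For readers who have not looked inside an OBX segment, the following Python sketch shows the idea by hand. In HL7 v2 the observation identifier lives in field OBX-3 as `identifier^text^name of coding system`, with `LN` denoting LOINC. Everything else here is an assumption for illustration: the sample segments, the tiny lookup tables, the friendly names and the local coding system `99LAB` are made up, the default `|` and `^` delimiters are assumed, and this is not how the Healthcare Connectivity Pack tooling is implemented.

```python
# Illustrative code tables keyed by coding system name. "LN" is LOINC;
# "99LAB" is a hypothetical bespoke (local) coding system.
CODE_TABLES = {
    "LN": {"2345-7": "Glucose (serum/plasma)"},
    "99LAB": {"BP01": "Systolic blood pressure (local code)"},
}


def parse_obx(segment):
    """Parse the parts of an OBX segment needed here, assuming default delimiters."""
    fields = segment.split("|")
    if fields[0] != "OBX":
        raise ValueError("not an OBX segment")
    # OBX-3 is identifier^text^coding system; pad so missing components become "".
    identifier, text, coding_system = (fields[3].split("^") + ["", "", ""])[:3]
    return {
        "code": identifier,
        "text": text,
        "coding_system": coding_system,
        "value": fields[5],  # OBX-5 observation value
        "units": fields[6],  # OBX-6 units
    }


def display_name(observation, code_tables):
    """Look up a friendly name, falling back to the text (or code) in the message."""
    table = code_tables.get(observation["coding_system"], {})
    return table.get(observation["code"], observation["text"] or observation["code"])


glucose = parse_obx("OBX|1|NM|2345-7^Glucose^LN||95|mg/dL|70-105|N|||F")
print(display_name(glucose, CODE_TABLES), glucose["value"], glucose["units"])
```

Plugging in a bespoke coding system in this sketch is just another entry in `CODE_TABLES`; codes that are not in any table fall back to the text carried in the message itself.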

Understanding and working with these healthcare standards is one big step forward, but the tooling goes way further. The tooling builds the Message Broker solution for you to validate inbound messages and extract the data you want. The data can be extracted into a simpler XML message format. This is the ideal approach if you want to send some extract of the healthcare data to a downstream application. Alternatively, the tooling can extract the data directly into database tables. In fact, the tooling will even design the database table layout, key selection and everything else for you!

If you are interested in learning more, take a look through these slides I have posted.

Medical Imaging (DICOM) with Message Broker

One of the key things we focus on with the Healthcare Connectivity Pack is to have a single healthcare ESB which can work with a wide range of protocols, standards, transports and formats. This focus is becoming ever more important as healthcare organisations seek to join up more sources of data whether they be clinical, financial or anything else for that matter.

In many healthcare organisations I talk to, DICOM has traditionally been kept separate from the main line integration of clinical applications (often using HL7). This legacy is quickly disappearing, though, accelerated by many factors, not least of which is work being done by IHE (for example, the scheduled workflow profile which requires integration between HL7 and DICOM modalities).

The latest version of the Healthcare Connectivity Pack includes support for medical imaging (DICOM). This enables a wide range of high value use cases. The diagram below shows an example walkthrough of how the Healthcare Connectivity Pack can route DICOM instances to external experts for a second opinion. This scenario comes in various flavours. For example, perhaps you have several radiologists and you want to balance the workload between them. Alternatively, radiologists with specialist skills might be geographically dispersed and so DICOM images can be routed to the correct location.

Another use case is to pre-fetch DICOM instances when a patient is admitted to a hospital:

This scenario uses the Healthcare Connectivity Pack to pull down DICOM images to the local PACS so that the right information is in the right place for a patient's effective treatment. These scenarios use a range of DICOM capabilities including C-FIND, C-STORE and C-MOVE.

I've posted some slides here which walk through all the technical details of the DICOM integration.

Healthcare Integration with Message Broker

We have just released our latest and greatest healthcare integration product, the Message Broker Healthcare Connectivity Pack. In the last three years we have made huge steps forward in the healthcare integration space; indeed, we have had four product releases in the last two years alone.

This new release comes packed with a host of new features:

The new capabilities augment the existing capabilities of the Healthcare Connectivity Pack:

I've posted some slides here which walk through the high level features now available.