Tuesday, April 28, 2015

Ask Any Question

One of the most striking trends we have seen recently is how customers are using our product for multi-channel personalisation. Historically, the offers and promotions presented to web site visitors have been focused on an individual and their behaviours. For example, a visitor could be identified as interested in car insurance if they browse to pages about car insurance, or search for the words 'car insurance' and 'quote' in the site search. Based on this segmentation, an appropriate offer could be presented. The key point in this example is that the segmentation is based on characteristics of that individual, in isolation.

Now consider a case where you want to offer your existing customers a discount or coupon, but only if they show an interest and haven’t yet purchased a car insurance product from you via any channel. Perhaps this offer is for just a few select customers based on their value to the business. So how best to calculate this value? Well, a first pass might segment the customers by their location (for example, country), add up their purchases over the last few months, and then rank them, with only the top n in each country receiving the offer. You can see straight away that this is a very different approach from the earlier example. Calculating this list requires a view across all the visitors and their activity on the site and other channels.


Total purchases is one measure, but it misses some important facts. For example, how profitable are the products they purchased? Calculating profitability requires a view on the supply chain. Equally, you might also include in the calculation how much support they have required, either in calls to a contact centre or call-outs (for example, home visits), to understand the profitability of that particular customer. Identifying this kind of most profitable customer requires many data sources to be integrated from across an enterprise.

The requirement to integrate many data sources and then act on the results has driven the latest integration in Celebrus, which we call Ask Any Question. With our latest release we can feed customer data into Hadoop and Teradata Aster, run wide-ranging queries across huge data sets, and then act on the results.

Furthermore, we provide a playbook which shows you step by step how to achieve this. One example feeds customer data into HDFS using Apache Avro. The data is loaded into Apache Hive, where analytical queries determine the most valuable customers. The results are formatted using Apache Pig and pushed into Celebrus, ready for presentation when those customers next visit.
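To make the query step a little more concrete, here is a minimal sketch, in Python via PyHive, of the sort of ranking query involved. The table and column names (purchase_events, purchase_value and so on), the 90-day window and the top-100 cut-off are all illustrative, not the schema or thresholds shipped in the playbook.

    # A minimal sketch, not the shipped playbook: rank customers by their
    # spend over the last 90 days within each country, keep the top 100.
    from pyhive import hive  # pip install 'pyhive[hive]'

    TOP_N_PER_COUNTRY = """
    SELECT country, customer_id, total_spend
    FROM (
        SELECT country,
               customer_id,
               SUM(purchase_value) AS total_spend,
               RANK() OVER (PARTITION BY country
                            ORDER BY SUM(purchase_value) DESC) AS rnk
        FROM purchase_events
        WHERE event_date >= date_sub(current_date, 90)
        GROUP BY country, customer_id
    ) ranked
    WHERE rnk <= 100
    """

    conn = hive.Connection(host="hive-server.example.com", port=10000)
    cursor = conn.cursor()
    cursor.execute(TOP_N_PER_COUNTRY)
    for country, customer_id, total_spend in cursor.fetchall():
        print(country, customer_id, total_spend)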

Apache Avro is one of the latest generation of file formats adopted by the Hadoop ecosystem. It is designed to solve many of the problems inherent in binary sequence files (versioning, language independence and schema awareness). The Hadoop Data Loader creates Avro files and pushes them directly into HDFS using WebHDFS. Once the files are in HDFS they can be added into Hive tables (and, by extension, Impala) with the LOAD DATA INPATH command. This command is very efficient because it simply moves the files within HDFS into the Hive warehouse directory.
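For the curious, here is a rough sketch in Python of the same pattern the Hadoop Data Loader automates for you: write events to an Avro file, push it into HDFS over WebHDFS, and let LOAD DATA INPATH move it into an Avro-backed Hive table. The host names, paths and schema below are placeholders, not product defaults.

    # A hedged, hand-rolled illustration; the Hadoop Data Loader does all of
    # this for you in production.
    import io
    import requests
    from fastavro import parse_schema, writer  # pip install fastavro

    schema = parse_schema({
        "namespace": "example.celebrus",   # illustrative, not the product schema
        "type": "record",
        "name": "PurchaseEvent",
        "fields": [
            {"name": "customer_id", "type": "string"},
            {"name": "country", "type": "string"},
            {"name": "purchase_value", "type": "double"},
            {"name": "event_time", "type": "long"},
        ],
    })

    events = [
        {"customer_id": "C123", "country": "GB",
         "purchase_value": 49.99, "event_time": 1430179200000},
    ]

    buf = io.BytesIO()
    writer(buf, schema, events)   # serialise the events as an Avro container file

    # WebHDFS CREATE is a two-step dance: the namenode redirects to a datanode.
    namenode = "http://namenode.example.com:50070/webhdfs/v1"
    path = "/celebrus/incoming/events.avro"
    redirect = requests.put(f"{namenode}{path}?op=CREATE&overwrite=true",
                            allow_redirects=False)
    requests.put(redirect.headers["Location"], data=buf.getvalue())

    # Once the file is in HDFS, pulling it into the Hive warehouse is a simple move:
    #   LOAD DATA INPATH '/celebrus/incoming/events.avro' INTO TABLE purchase_events;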

If you would like to learn more about the Celebrus Data Loader, take a look at these slides.


Monday, April 27, 2015

Data Visualisation

Multi-channel personalisation is just one use of the customer data Celebrus provides. Another equally valuable use of the data is for reporting and analytics. This can vary from high level dashboards for executives, through to deep path analysis in Teradata Aster. Making our data easy to use is a major focus for us. This is especially true with relational databases where data has to be joined across tables, a common cause of errors.



With all that in mind, we now provide a simple-to-use set of data visualisation views. These views sit on top of the standard Celebrus data tables and do all the tricky stuff, such as the joins, for you. They make it an absolute breeze to create compelling workbooks, dashboards and reports in whatever business intelligence tool you choose.
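Purely to give a flavour of what a view like this does behind the scenes, here is an illustrative sketch. The shipped visualisation views are defined by the product; the tables, columns and SQLite stand-in below are invented for the example.

    # Illustrative only: a view that pre-joins session and page-view tables so a
    # BI tool can query one flat object. Names are made up, not the Celebrus schema.
    import sqlite3  # stand-in for whichever warehouse your BI tool points at

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE sessions   (session_id, customer_id, referrer);
    CREATE TABLE page_views (session_id, page_title, view_timestamp, time_on_page_ms);

    CREATE VIEW v_page_views AS
    SELECT s.session_id,
           s.customer_id,
           s.referrer,
           p.page_title,
           p.view_timestamp,
           p.time_on_page_ms
    FROM sessions s
    JOIN page_views p ON p.session_id = s.session_id;
    """)

    # The BI tool now queries v_page_views and never has to write the join itself.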

Along with these great product improvements we are also announcing our new partnership with Qlik. This is a major step forward for us, putting the right platform and data directly into the hands of decision makers across the business.



Wednesday, February 25, 2015

Data Loader for Celebrus

It is just a few weeks since we completed the acquisition of Celebrus by IS Solutions. Since then we’ve moved offices from Newbury to Sunbury-on-Thames. Not surprisingly, that involved clearing out a whole heap of stuff which had been lying around for far too long. Along the way we unearthed several dusty product release CDs going back over a decade.

Looking back at those earlier releases, it’s interesting to contrast the focus back then with our latest release (v8 update 11). Ten years ago, data collection was focused primarily around reporting: lots of totals, averages and aggregations of one kind or another. And the technology matched those requirements. In Celebrus terms, this was, and still is, implemented by our Analytics Server, part of our v8 Big Data Engine. Every so often the Analytics Server fires up and calculates summary information from activity in the last five minutes, hour, day and so on. The results of that processing are written to a set of database tables. There’s nothing inherently wrong with the Analytics Server approach; it is simply that the world has moved on.


The focus today is almost exclusively on highly detailed data about individuals, not just summary information. The data also needs to be available in near real-time. This information is crucial to understanding each and every journey a customer has had with your brand. Armed with this insight into customer behaviour, a whole slew of possibilities unfolds which enables you to understand and optimise your business, whether that be offering a discount to a valuable customer, or understanding why someone chose a competitor’s product. All these use cases and many more start with data.

As Sherlock Holmes once said:

It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.

So with all this in mind, you will see in our latest release our new Data Loader. The Data Loader is our go-forward architecture for loading data at lightning-fast speeds. It scales to support huge amounts of traffic on some of the busiest web sites in the world, and can process tens of thousands of events per second, sustained. Not just that, it delivers the data into your systems in less than a minute, making new use cases around streaming analytics possible. We support MySQL, Microsoft SQL Server, Oracle and Teradata out of the box. Better yet, the Data Loader includes a pre-defined database schema covering some 75+ tables and models: everything you might want to understand about your digital customers.


In addition we’ve been working hard with the folks over at MongoDB. This release has been fully certified with MongoDB Enterprise Edition. This makes MongoDB the perfect data store for Celebrus customer journey data. This customer journey data is focused towards operational applications. For example, contact centre staff use this information to help them understand a customer’s interactions with your brand.

This is the first release where we have worked with a document database, and it has been a really good experience. The flexibility, simplicity and productivity of MongoDB are tremendous. For example, in MongoDB we simply store all business events in a single collection (rather than lots of normalised relational tables). Each type of business event contains some common attributes (timestamp, session number, customer identifier, event type and so on). Each event type also contains some more specific information, for example the purchase price, quantity and SKU code for a purchase transaction event. All of this just works with MongoDB: no friction, no joins, no complexity. Job done!
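As a small, hedged sketch of that shape (the field names are illustrative, not the exact Celebrus document layout), here is what the single-collection approach looks like with pymongo:

    # Every business event lands in one collection: shared attributes plus
    # whatever extra fields each event type needs.
    from datetime import datetime, timezone
    from pymongo import MongoClient  # pip install pymongo

    client = MongoClient("mongodb://localhost:27017")
    events = client["celebrus_demo"]["business_events"]

    common = {
        "timestamp": datetime.now(timezone.utc),
        "session_number": 42,
        "customer_id": "C123",
    }

    # A purchase transaction event carries its own specific fields...
    events.insert_one({**common, "event_type": "purchase",
                       "sku": "LAWN-100", "quantity": 1, "price": 249.99})
    # ...and a search event carries different ones, with no schema change.
    events.insert_one({**common, "event_type": "search",
                       "search_terms": "petrol lawnmower"})

    # One query spans every event type for a customer, with no joins.
    for doc in events.find({"customer_id": "C123"}).sort("timestamp"):
        print(doc["event_type"])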

Interested in reading more? These slides walk through the Data Loader.

Tuesday, September 30, 2014

So What's The Context?

Taking words out of context is one of those distortions which never fails to amuse. Whether it be entertainment, politics, or history, everything seems vulnerable to removing a few choice words. And context isn’t reserved just for the world of words; photography is fair game as well, as this example illustrates:



So why is context important for events and data collection? The answer is that an event, perhaps a customer visiting a certain page on your web site, is rarely clear in a vacuum. The information surrounding the event provides a perspective through which the event makes sense and can be accurately understood.

In this example, perhaps the customer is visiting a product page for lawnmowers and has done so a few times in the last week. The result of understanding the context is that we can make better decisions. A marketer might like to offer the customer a discount to turn their repeated browsing into a purchase.

Every event which the Celebrus platform collects and processes has a context added to it. What does this mean in practice? Well actually it’s really simple. First we identify the core attributes for an event, for a completed transaction that might be: the currency, the total purchase value, the set of items in the basket. Then we supplement this with a much larger set of context attributes which complete the picture.

For example, a few of the context attributes include the customer email address, where in the world they are browsing from, and what kind of device they are browsing on. This combination of the key (primary) attributes and the context attributes defines the event. And every event can be fed in real time into your data warehouse.
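Purely as an illustration (the attribute names below are invented, not the product schema), a completed transaction event might be rendered something like this:

    # Primary attributes say what the event is; context attributes complete the picture.
    purchase_event = {
        "event_type": "transaction_complete",
        "currency": "GBP",                       # primary: what was bought, for how much
        "total_value": 249.99,
        "basket": [
            {"sku": "LAWN-100", "quantity": 1, "unit_price": 249.99},
        ],
        "context": {                             # context: who, where, when, on what
            "customer_email": "customer@example.com",
            "geo_country": "GB",
            "device_type": "tablet",
            "session_number": 42,
            "timestamp": "2014-09-30T10:15:00Z",
        },
    }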

This understanding of context becomes even more critical when we talk about path analysis. Path analysis allows us to chart a voyage over time for some set of attributes. For example, a simple path analysis allows us to view the set of pages a customer visited before they left to join a competitor. Understanding those pages might be very useful for identifying potential churn. This ability to gather really meaningful business information is why feeding data into our partner products like Teradata Aster is one of the biggest growth areas in our product.
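As a toy illustration of the idea (real deployments push this work into Teradata Aster and its path-analysis functions), here is a sketch that groups page-view events into per-customer paths and picks out the ones ending at a crude churn marker:

    # Group contextualised page-view events by customer, order them by time,
    # and print the page sequence for anyone who ended up cancelling.
    from collections import defaultdict

    events = [
        {"customer_id": "C123", "ts": 1, "page": "home"},
        {"customer_id": "C123", "ts": 2, "page": "car-insurance-quote"},
        {"customer_id": "C123", "ts": 3, "page": "cancel-policy"},
        {"customer_id": "C456", "ts": 1, "page": "home"},
    ]

    paths = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        paths[event["customer_id"]].append(event["page"])

    for customer, pages in paths.items():
        if pages[-1] == "cancel-policy":   # crude stand-in for "left for a competitor"
            print(customer, " -> ".join(pages))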

Interested in more blogs from the Celebrus team? Click here to get to our data hub.

Friday, July 25, 2014

Top Tips for Looking at Data Analytics – the Five W’s

Celebrus is all about data. Specifically, data which tells you what your customers are doing and via which channels. If I were a marketer, instead of a software developer, I’d probably call this ‘audience behavioural intelligence’, or some such fancy term. When we talk about data we really mean events. Events are the lifeblood of Celebrus. Our software captures, crunches, transforms and processes billions of them every day. And it’s these events which give you the insight into what your customers are doing.

So what does this all mean in practice?

The answer we will explore in this blog post is guided by one of the world’s greatest writers of children’s stories, namely Rudyard Kipling.

Let’s get started with an extract from The Elephant’s Child:

I keep six honest serving-men:
    (They taught me all I knew)
Their names are What and Why and When
    And How and Where and Who.

When considering events, the what, where, when and who are exactly the questions we need to ask. And by using data analytics to develop deep insight from the data you can answer not just those questions, but also the why and how.

Every time an individual does something on your web site, an event captures essential information about their activity. For example, a customer might type some words into a text box and click ‘submit’. On the face of it that’s not very interesting, but stay with me, it gets better.


Celebrus can tell you whether an individual visitor arrived via a search engine or typed the URL directly into the browser, as well as capturing the referring campaign information. It also lets you know that during their visit they viewed several products in the gardening department and put one into their basket, but sadly left the site before completing a checkout. You’ll also learn that the same visitor returned a few days later, following an abandoned-basket email they received, and did indeed complete their transaction this time. That customer is now in the top quartile of the most valuable individuals to your web site, and has been for over a year.

A simple example, but hopefully it illustrates the point. What Celebrus does is take that torrent of events coming from your web applications (and elsewhere, such as mobile applications and social media interactions) and convert it into a stream of contextualised business events. The business events identify who the customer is, what they did (transaction, wish list add, search, basket add and much more), where in the application it happened, and when it occurred. The business events are fed into your data warehouse in real time, where they are available to help you really understand your online business.

It’s this ability to capture, collect and transform business events which sets Celebrus apart.

Interested in more blogs from the Celebrus team? Click here to get to our data hub.

Tuesday, December 17, 2013

Celebrus v8

Six months have flown past since I joined Celebrus, and without further ado I think it’s time to post some information on our latest product release, namely Celebrus v8. This release is hugely important to our clients, rolling together an incredible amount of work over the last couple of years. Given that we’re only a small company, we punch way above our weight. Not heard of us? Well, we provide customer analytics on 20K+ web sites worldwide. We sell almost entirely through our channel partners (OnMarc, SAS and Teradata to name but three).

A very brief summary of what’s new in our latest release:

Real-Time Event Stream

The most important feature in this release. We have for many years provided an analytics facility to crunch data in Celebrus and then push the summarized information out into data warehouses for reporting (Oracle, MySQL, Teradata et al). With this release we can push the detailed event data out in real-time (CSV is often the data format of choice).

This approach is perfect for the new range of data-crunching applications like Teradata Aster and Hadoop. Ultra-low latency is the key to making this fly. Teradata Aster enables a wide range of data discovery capabilities, including path and graph analysis.
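To show how little is needed on the receiving end, here is a minimal sketch that reads such a CSV feed. The column names and the file-based transport are assumptions for the example, not the product contract:

    # Each row in the feed is one detailed event, available moments after it happened.
    import csv

    with open("celebrus_events.csv", newline="") as feed:
        for row in csv.DictReader(feed):
            print(row["session_id"], row["event_type"], row["timestamp"])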


Semantic Configuration

One of the unique features of Celebrus is that we turn low-level events (clicks, form submits, field interactions et al) into meaningful business events like checkout, basket add, product views, marketing campaign attribution, customer behaviours and much more.

Furthermore, this isn't just for web site users; it works across all channels.

We do this without any tedious, costly and error-prone web page tagging. How do we do this? The solution is to configure the Celebrus system with the meaning of certain events, our so-called semantic configuration. You simply tell the system how to interpret user actions and we turn those low-level form submits into a set of meaningful search terms and so on. It is easy to do and quick to change as your customer channels evolve.
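The real semantic configuration lives in the Celebrus tooling, but the idea can be sketched in a few lines; the form identifiers and field names below are invented for illustration:

    # Declare what a low-level form submit means, then re-emit it as a business event.
    SEMANTIC_CONFIG = {
        "site-search-form": {"event": "search", "fields": {"q": "search_terms"}},
        "add-to-basket":    {"event": "basket_add", "fields": {"sku": "sku"}},
    }

    def to_business_event(form_id, submitted_fields):
        """Turn a raw form submit into a named business event, per the config."""
        rule = SEMANTIC_CONFIG[form_id]
        event = {"event_type": rule["event"]}
        for raw_name, business_name in rule["fields"].items():
            event[business_name] = submitted_fields.get(raw_name)
        return event

    print(to_business_event("site-search-form", {"q": "car insurance quote"}))
    # -> {'event_type': 'search', 'search_terms': 'car insurance quote'}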

Real-Time Decision Management

There is some very smart event processing software out there. A good example is Teradata RTIM, which enables you to push personalised messages to your customers in real time through Celebrus. Celebrus provides the customer identification (who) and the reliable delivery platform (where/how), and RTIM does the really interesting part of deciding the content (what/when). Blend into that an understanding of what has worked in the past and RTIM can use its integrated predictive models to fine-tune the offers.

Performance

With this release we’ve pushed the envelope way out. Even on some fairly run-of-the-mill multi-core hardware (by today’s standards) we can process 25K+ events per second, sustained. The performance graph below is from one of our test machines running in a Jenkins cluster. Wondering about the uptick halfway through? That was the result of some tuning around the Java CMS garbage collector, but that's another story.




Integrated Security and Management

A single place to manage users and roles across the system. Many of our clients deploy Celebrus to cope with huge event rates (10K+ events per second sustained is common) and so not surprisingly there is a need to scale for performance and reliability reasons. With our latest release, a Celebrus cluster can be managed through a single smart client interface.

I’ve posted some slides here which explain a little more about our company and technology.



Sunday, April 7, 2013

Healthcare Analytics with Message Broker

One feature I would like to call out in the new version of the Healthcare Connectivity Pack is our healthcare analytics capability. This feature makes it incredibly easy to find, search and extract meaningful data from a range of healthcare data sources (HL7 v2, DICOM and CDA). The tooling can do this because it understands these healthcare formats, not just at a syntactic level but at a semantic level as well. This knowledge of a standard is encapsulated in what we call a Data Analysis Profile (DAP).

For example, in the case of clinical documents, the CDA DAP understands the HL7 v3 RIM and structural attributes (such as templateId and root). With this understanding the tooling can pick apart clinical documents and show you the meaning of different sections (medications, procedures and so on). This is just the tip of the iceberg; the tooling makes it a breeze to do really complex work with clinical documents.

Another example is HL7 v2 observation results: the HL7 DAP understands where the coding system is stored in the OBX segment. Armed with this information the tooling deciphers LOINC codes to present user-friendly display names when you are working with your data. In fact you can plug in your own bespoke coding systems as well (a common requirement among providers).
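To make that concrete, here is a hand-rolled sketch of the kind of lookup the tooling does for you; the OBX segment and the one-entry LOINC table are just sample data:

    # OBX-3 holds the observation identifier as code^text^coding-system;
    # 'LN' identifies LOINC, so the code can be mapped to a display name.
    LOINC_DISPLAY = {"718-7": "Hemoglobin [Mass/volume] in Blood"}

    obx = "OBX|1|NM|718-7^Hemoglobin^LN||13.5|g/dL|12.0-16.0||||F"

    fields = obx.split("|")                      # HL7 v2 fields are pipe-delimited
    code, text, coding_system = fields[3].split("^")[:3]

    if coding_system == "LN":
        print(code, "->", LOINC_DISPLAY.get(code, text))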


Understanding and working with these healthcare standards is one big step forward, but the tooling goes way further. It builds the Message Broker solution for you, validating inbound messages and extracting the data you want. The data can be extracted into a simpler XML message format, which is the ideal approach if you want to send some extract of the healthcare data to a downstream application. Alternatively, the tooling can extract the data directly into database tables. In fact, the tooling will even design the database table layout, key selection and everything else for you!

If you are interested to learn more, take a look through these slides I have posted.