
Following a recent requirement from one of our OEM customers, we explore here how to configure icCube for a highly available, redundant solution.

Overview

At the highest level, the customer solution consists of two icCube server instances for performance and high availability: a main server and a hot backup (failover) server. If the main server fails for any reason, the backup server takes over its role.
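The text does not say how traffic is switched from the main server to the backup; a load balancer or a DNS switch would be equally plausible. As a minimal sketch of the idea, assuming a client-side switch, the following Python tries the main server first and falls back to the backup. The server addresses and the `send` callable are hypothetical placeholders and not part of icCube.

```python
from typing import Callable, Sequence


class AllServersDown(Exception):
    """Raised when neither the main nor the backup server answers."""


def query_with_failover(servers: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each server in priority order and return the first successful reply.

    `servers` lists the main server first and the hot backup second.
    `send` is a hypothetical transport function that raises ConnectionError
    when a server cannot be reached.
    """
    failures = []
    for url in servers:
        try:
            return send(url)
        except ConnectionError as exc:
            failures.append(f"{url}: {exc}")  # remember why this server was skipped
    raise AllServersDown("; ".join(failures))


# Hypothetical addresses, for illustration only.
SERVERS = ["http://iccube-main.example", "http://iccube-backup.example"]


def fake_send(url: str) -> str:
    """Stand-in transport that simulates an outage of the main server."""
    if "main" in url:
        raise ConnectionError("connection refused")
    return f"reply from {url}"


print(query_with_failover(SERVERS, fake_send))
```

Because the backup is a hot standby, the caller needs no special handling beyond trying the next address; an error is raised only when both servers are unreachable.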

The servers are kept in sync by updating the schemas on both of them; this process is transparent and needs no special settings. When you load a schema, you load it on both servers, and the schemas are defined with the same real-time strategy. Report modifications made by end users, however, are more complicated to handle.

When icCube acts as a report server, user-saved reports must stay consistent across the system. This is a read/write data consistency problem, and it is where things become complicated: the servers must be kept synchronized so that all system users see the same view at any time. The main effort was therefore to propose a Web Reporting architecture that fits into the customer's high availability cloud solution.

High Availability Web Reporting

Our customer faced this challenge mainly because several system users are self-BI users: they can create new reports and edit existing ones. The overall system must therefore support distributed read/write consistency.

This rules out, for example, the simple approach of running several icCube instances that each hold the same set of reports: it would require a quite complex (and not fail-safe) mechanism to keep all instances synchronized.

The chosen implementation relies on the clustering feature of the underlying persistence layer used for Web Reporting. icCube relies on a JCR (Java Content Repository) implementation, more precisely Apache Jackrabbit, whose clustering support makes it ideal for this customer requirement when combined with a relational database (SQL Server in our case).

Apache Jackrabbit Clustering Feature

A Jackrabbit cluster means that the same JCR content is shared between all cluster nodes (icCube instances). All nodes access the same persistent storage. In this cloud solution, that storage is an SQL Server with failover, which ensures that the end-to-end solution provides high availability and failover.

Each icCube instance in the overall system is then configured to access a Jackrabbit cluster node, so that all icCube servers see the same JCR content.
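To illustrate what "seeing the same JCR content" involves, here is a deliberately simplified Python model, not Jackrabbit code. It assumes a shared store that holds both the content and an ordered change journal, and cluster nodes that keep a local cache which they refresh by replaying journal entries they have not yet seen. All class and method names are hypothetical.

```python
class SharedStore:
    """Stands in for the shared database used by every cluster node."""

    def __init__(self):
        self.content = {}   # report path -> report definition
        self.journal = []   # ordered list of paths changed so far


class ClusterNode:
    """Simplified cluster node: local cache plus a position in the shared journal."""

    def __init__(self, node_id, store):
        self.node_id = node_id
        self.store = store
        self.cache = {}
        self.revision = 0   # number of journal entries already applied locally

    def sync(self):
        """Drop cached entries for every path changed since the last sync."""
        for path in self.store.journal[self.revision:]:
            self.cache.pop(path, None)
        self.revision = len(self.store.journal)

    def save(self, path, definition):
        """Write to the shared store and record the change in the journal."""
        self.sync()                          # apply other nodes' changes first
        self.store.content[path] = definition
        self.store.journal.append(path)
        self.cache[path] = definition
        self.revision = len(self.store.journal)

    def read(self, path):
        """Serve from the local cache, loading from the shared store on a miss."""
        if path not in self.cache:
            self.cache[path] = self.store.content[path]
        return self.cache[path]


store = SharedStore()
main, backup = ClusterNode("main", store), ClusterNode("backup", store)

main.save("/reports/sales", "v1")
first = backup.read("/reports/sales")    # loaded from the shared store
main.save("/reports/sales", "v2")
stale = backup.read("/reports/sales")    # still the cached copy
backup.sync()
fresh = backup.read("/reports/sales")    # cache entry was invalidated by the journal
print(first, stale, fresh)
```

The point of the model is that no instance-to-instance copying is needed: every node writes to the same storage, and the journal only tells the other nodes which cached entries are out of date.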

Without going into much detail, a JCR node uses:

  • a local file system to save repository global states,
  • a persistence manager for the actual content,
  • a data store to persist data “too large” for the persistence manager itself.
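The division of labour between these three pieces can be sketched as follows. This is an illustrative Python model only: the size threshold, the content-hash keys and every name in it are assumptions of the sketch, not the actual Jackrabbit implementation or its configuration.

```python
import hashlib


class JcrNodeStorage:
    """Toy model of the three storage parts used by a JCR node."""

    INLINE_LIMIT = 100  # hypothetical threshold in bytes for "too large"

    def __init__(self):
        self.local_state = {"changes": 0}  # stand-in for state kept on the local file system
        self.persistence_manager = {}      # path -> ("inline", bytes) or ("ref", digest)
        self.data_store = {}               # digest -> bytes, for large values

    def put(self, path: str, data: bytes) -> None:
        """Store small values inline and large ones in the data store."""
        if len(data) < self.INLINE_LIMIT:
            self.persistence_manager[path] = ("inline", data)
        else:
            digest = hashlib.sha1(data).hexdigest()   # key large values by content hash
            self.data_store[digest] = data
            self.persistence_manager[path] = ("ref", digest)
        self.local_state["changes"] += 1

    def get(self, path: str) -> bytes:
        """Return the value, following the reference into the data store if needed."""
        kind, value = self.persistence_manager[path]
        return value if kind == "inline" else self.data_store[value]


storage = JcrNodeStorage()
storage.put("/reports/sales/title", b"Sales overview")
storage.put("/reports/sales/snapshot", b"x" * 5000)
print(storage.persistence_manager["/reports/sales/title"][0])     # inline
print(storage.persistence_manager["/reports/sales/snapshot"][0])  # ref
```

In a cluster, the persistence manager and the data store in this model are the parts that would have to be shared by all nodes.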

You can find detailed information about cluster configuration on the Jackrabbit wiki. In essence, the persistence manager must be clusterable, which is the case when using a database backend. Our customer's overall solution was already using Microsoft SQL Server, so it was a natural fit for the icCube JCR configuration.

For more details about the actual configuration, icCube comes with both a Microsoft SQL Server example (icCubeRepository-cluster-mssql.xml) and a Postgres example (icCubeRepository-cluster-postgres.xml).

Last Words

Thanks to its modular architecture and its adherence to well-accepted standards (e.g., HTTP, J2EE, JCR), icCube can be deployed (and is deployed) in high availability solutions.

References

Jackrabbit Cluster configuration

icCube JCR configuration

By David Alvarez