Many-to-many “m2m” relations are defined by a relation where one object is related to multiple other ones and vice versa, e.g. a bank account can be shared by multiple people and one person can be the owner of multiple accounts. This article will show how to create these particular aggregations.

On the example we will use for this article, we have 4 bank customers: Phil, Kate, Josh and Samantha; where Kate & Josh as well as Kate & Samantha have joint accounts, respectively named “Account – KJ” and “Account – KS”.

The difficulty of managing these relations when data modelling lies on the fact that you want to avoid counting several times the same amount.

On the left, you have the balances per account and on the right, the assets by each customer. If you sum the amounts of the customer’s assets you’ll end up with a total of 800, that is more than the sum of all bank accounts (600). Why? Kate and Josh’s joint account has a total balance of 100, Kate’s balance in this account is 100 so is Josh’s balance, but when aggregating both the total is not 200.

The figure below represents the same data as above, showing in one table only, both balances per account and per customer.

This ‘strange’ visual effect in the pivot table is due to the m2m relations as parent members (Accounts’ total balance) are possibly not the aggregation of their children (Customers) anymore. Note that in fact, Kate & Josh’s joint account has a total balance of 100.

Be most careful when manipulating m2m relations as not all analytical and reporting tools offer this functionality out-of-the-box and you can end up with miscalculated results.

MANY-TO-MANY RELATIONS IN ICCUBE

There are two ways setting up (out-of-the-box!) m2m relations in icCube. The first by creating two separate dimensions and using a bridge table, and the second by creating one dimension with two hierarchies.

The difference between the two methods is that the first will show all possible combinations even if they don’t exist by construction and the second method removes non-existent combinations.

Let’s explore how to construct the data model using both these methods: Many-to-Many relations using a Bridge table and Many-to-Many dimensions.

For both methods, the data is set as follows:

  • Customers’ table:
  • AccountName   , Customers
    Account – P       , Phil
    Account – K       , Kate
    Account – J       , Josh
    Account – S       , Samantha
    Account – KJ     , Kate
    Account – KJ     , Josh
    Account – KS     , Kate
    Account – KS     , Samantha
  • Balance table:
  • AccountName   , Amount
    Account – P       , 100
    Account – K       , 100
    Account – J       , 100
    Account – S       , 100
    Account – KJ     , 100
    Account – KJ     , 100
    Account – KS     , 100
    Account – KS     , 100

FIRST METHOD: MANY-TO-MANY USING A BRIDGE TABLE

The bridge table method is the usual m2m method. Construct both your dimensions separately: a multi-level dimension for Customers and another for AccountName.

Now on the Facts table, for linking Customers to its corresponding data, create a Bridge table as follows:

By doing so, you are creating a link between your two tables through the AccountName. Therefore, the customers are now associated to their corresponding account(s).

The bridge schema is available for download here. Unzip and then import into the icCube builder.

SECOND METHOD: MANY-TO-MANY DIMENSIONS

You may create m2m relations by defining several hierarchies in a same dimension. This will result in showing only combinations {Customer, Account} that exist on your table. This behavior is called autoexist, it performs a NON EMPTY to remove empty tuples, and is in fact faster than a NON EMPTY.

The data used is exactly the same as the method above. Create the dimension BankAccount from your Customers’ table containing two hierarchies: AccountName and Customers.

No need to create a bridge on your facts table. Find below a screenshot of the BankAccount dimension browsing, as well as the facts table for this second method.

The m2m dimension schema is available for download here. Unzip and then import into the icCube builder.

Find more information on Many-to-Many relations in our Documentation: Many-to-Many using a Bridge Table and Many-to-Many dimensions.

Weekly, monthly or quarterly reports? Getting an alert if a threshold is met on your real time data?

These two concepts don’t seem similar but the way to set them up is exactly the same on icCube! This article will show you how to create both these alerts.

PDF sent through Email

To send a PDF report through email, you first – and obviously 🙂 – need to create the dashboard you’d like to receive with a Printing Layout (check how to here)

The Alert feature is available on the Monitoring tab. Check below an example of how to fill in the form in order to have a weekly report received every Friday at 12:00.

The email received will indeed contain the PDF report attached, and the content of the email is editable, you can for instance type in the link for the live dashboard.

Several triggers are available to determine the frequency of alerts sent:

  • Cron
  • Continuous
  • Fixed Interval
  • Daily (once per day)
  • Daily (several times per day)
  • Once
  • On Data Changed

One last point is the ERROR message. If your dashboard contains real time data (see incremental load) and the cube fails to load and giving an error on your report, you can either:

  • Check the “Continue on error” box to allow sending the report even though it contains an error,
  • Or, not check it and the usual email is not sent. You can set up a mail “(On Error)” informing that a failure has occurred.

Alerting a threshold was met

This feature also allows you to set an alert dependant on an MDX expression, therefore if a special situation happens.

For instance, if you have a call center service in your company and people who have called in have been waiting for more than 10 minutes, or if a truck transporting the goods you sell is 2 hours or more behind schedule, etc.

Note that on the previous example the expression “TRUE” is used, showing that the alert does not depend on an MDX statement. An example of threshold can be:

IFF( ([Sales Amount], dtAsOfToday([time])) < ([Sales Amount], dtAsOfToday([time]).lag(1)) , TRUE , FALSE )

therefore sending the alert if the sales were lower than the previous day.

This alert system allows to take immediate actions on your daily work and can definitely help in business critical situations.

Following a recent requirement of one OEM customer, here we explore how to configure icCube for a highly available and redundant solution.

Overview

On the highest level, the customer solution is made up of two instances of icCube servers for the sake of performance and high availability: one main server and a hot backup, or failover, server. If for any reason the main server fails, the backup server takes the role of the main server.

Synchronizing the servers is achieved by updating the schemas in both servers; this process is transparent and does not need any special settings. When you are loading a schema, you perform this on both servers; schemas are defined with the same real-time strategy. However, it is more complicated for end users to make report modifications.

For icCube active as a Report Server, we need to maintain the consistency of user-saved reports across the system. Thus, we have to deal with read/write data consistency, and here is where things become complicated: the different servers must be kept synchronized to offer the same view at any time to all system users. Therefore, the main effort was to propose an architecture for the Web Reporting that can fit into the customer high availability cloud solution.

High Availability Web Reporting

Our customer faced this challenge mainly because several system users are considered self-BI users, meaning they can create new reports and edit existing ones. Thus, the overall system must support read/write distributed consistency.

This, for example, rules out a simple solution with several instances of icCube having the same set of reports; this would mean implementing a quite complex (and not fail-safe) mechanism to keep all instances synchronized.

The chosen implementation relies on the clustering feature of the underlying persistency layer used for Web Reporting. icCube relies on a JCR (Java Content Repository) implementation, more precisely on Apache Jackrabbit, which supports clustering features that make it ideal for that customer requirement when combined with a relational database (SQL Server in our case).

Apache Jackrabbit Clustering Feature

A Jackrabbit cluster means that the same JCR content is shared between all cluster nodes (icCube instances). All nodes will access the same persistent storage. For this cloud solution, an SQL Server with failover, this ensures the end-to-end solution implements high availability and failover.

Each icCube instance that makes up the overall system is then configured to access a Jackrabbit cluster node, meaning all icCube servers are seeing the same JCR content.

Without going into much detail, a JCR node uses:

  • ia local file system to save repository global states,
  • a persistent manager for the actual content,
  • a data-store to persist data “too large” for the persistent manager itself.

You can find detailed information about a cluster configuration from the Jackrabbit WIKI. But it basically means the persistent manager must be clusterable, which is the case when using a DB backend. In the case of our customer, the overall solution was already using a Microsoft SQL Server, so it became a natural fit for the icCube JCR configuration.

For more details about the actual configuration, icCube comes with both a Microsoft SQL Server example (icCubeRepository-cluster-mssql.xml) and a Postgres example (icCubeRepository-cluster-postgres.xml).

Last Words

Thanks to its modular architecture and respect of well-accepted standards (e.g., HTTP, J2EE, JCR), icCube can be deployed (and is deployed) in high availability solutions.

References

Jackrabbit Cluster configuration

icCube JCR configuration:

What is NoSQL Reporting?

NoSQL Databases are great for high scalability and flexible data storage but reporting can be tricky because of the lack of defined structure and the performance cost. Multi-Dimensional Reporting tools are great for light speed analytics and reporting on structured data but do not work well on fast changing unstructured data. Why not combine the best of both worlds? MongoDB Reporting with icCube allows you to create advanced reports with NoSQL databases.

MongoDB Reporting with icCube

mongodb reporting with icCube

As a new feature icCube integrates natively MongoDB: Map/Reduce, Aggregation and complex JSON Object Types are now features fully available in all icCube versions (including free Community).

MongoDB Reporting with icCube brings fast real time in-memory data analysis to MongoDB and in a near future to other NoSQL databases. Once the data has been imported into icCube, you can use the full power of MDX+ for the most demanding data analysis. You will be able to report with your favorite XMLA reporting tool (e.g. Excel) as well with icCube’s Reporting.

Note that this MongoDB data can be combined with data coming from other data sources (e.g. RDBM, Excel, CSV, BigQuery amongst others) to create more complex and powerful analysis.

Let’s imagine we’ve the following collection/documents in MongoDB:

{
    customer: {
        id: "2342WKO"
        gender: “male”
    },
    shoppingCart: {
        id: "423KIL"
        date :”2012-02-23T12:23:00.000Z”
    },
    products: [{
        sku: "2334FGH",
        quantity: 1
    }, {
        sku: "8764KJU",
        quantity: 2
    }]
    ...
 }
How to integrate into icCube ?

First we have to bring the data into the icCube server. For this we’ve two different ways of creating queries: aggregation pipelines vs map/reduce commands. Both will return a list of documents (JSON records).

When importing these documents icCube will automatically transform the JSON records into structures for optimal many to many modeling. There is no need to denormalize, unwind, each JSON record, for example generate multiple lines for each product.

This new features will keep minimum data on the wire, reduce memory footprint as well as improve overall MDX+ query performance.

Once done you’ll have to create your dimensions and measures matching your business attributes and you’d be ready for data analysis and reporting.

Last Words

This is the end of this quick overview of MongoDB support in icCube. We invite you to experiment with MongoDB by downloading icCube. For more questions and details please use our forum.