Academy

Shopping Cart Advanced Analytics with icCube and MongoDB

In our previous post, we introduced how to load shopping cart data stored in MongoDB to icCube leveraging the array support. Here we introduce the MDX+ SubCube functions of icCube. MDX+ are icCube specific extensions to the MDX standard.

The kind of question we’re going to answer here looks like, ‘How to make stats/count all shopping carts with TVs that have no cheese sales?’

Even though this query might not look as the most thrilling one, we might easily extend this for other scenarios and businesses as surveys and other customer behaviour scenarios.

Coming back to our query, there is no straightforward way to translate the question into MDX language. There is no way to create a filter defined as ‘all shopping carts with TVs but without cheese’. Why?

A short introduction to standard MDX

The MDX language works nicely in a universe without many-to-many relations but less so when those relations exist. By many-to-many we mean a hierarchy where several members, e.g., [United Kingdom] and [USA], are pointing to the same data. By definition, those members are from the same hierarchy, [Geo], and level, [Country]. Without many-to-many, defining a filter like [UK] – [US] has no sense as it is empty by construction.

An approach to understanding what is going on in MDX is to see MDX tuples as a way of defining/filtering a set of fact rows. In standard MDX, and with tuples without calculated measures, we can have two ways combining rows: Union and Intersects.

The tuple ( [USA], [2010] ): will be an intersect, aka logical-and, of the fact rows defined by [USA] and [2010].

A set {[USA],[UK]}: will be a union, aka logical-or, of the fact rows defined by [USA] and [United Kingdom].

We can add sets and tuples to the ‘where’ clause and to the subquery, but they impact how tuples and calculated members are evaluated. Unfortunately, this is all we have in standard MDX as set operations and combining them is an exercise of technique, experience, and imagination. Some operations are just overly complicated and using calculated measures might be error prone, poorly stable, and/or time consuming. Let’s say there is no proper support in MDX for basic set operations: combining union, intersection, and complement functions (logical and, or & not).

How to overcome this limitation

Since icCube 4.7, we have the MDX+ SubCube functions family:

SubCubeUnion, SubCubeIntersect, SubCubeComplement, SubCubeMinus, and SubCubeSymDifference.

Minus and Symmetrical Difference are a shortcut for combinations of the first three basic ‘brick’ operations. SubCubeMinus( A, B ) is a shortcut for SubCubeIntersect( A, SubCubeComplement(B) ). You can define your own set functions using MDX+ functional support(link here).

How do these functions work?

For example, let’s take SubeCubeMinus( [USA] , [United Kingdom] ).

This will filter all rows defined by [USA] minus the rows defined by [United Kingdom]. Note that the parameters of SubCube functions do not need to have the same dimensionality; SubCubeMinus( [USA], ([UK],[2010] ) is a valid MDX statement.

Before entering into an MDX example, feel free to try the following reporting example:

[icCubeReport reportName=”/shared/icCube Blogg/Shopping Cart – SubCube” width=760 height=820 iframe=false]

This is a dynamic report embedded with icCube’s WordPress plugin

An example of MDX syntax looks like:

WITH
  CATEGORY HIERARCHY [Stats].[Filter]
  CATEGORY STATIC MEMBER [Stats].[Filter].[All-M].[filter] AS SubCubeMinus( [Product].[Product Family].[Product Family L].[TV], [Product Plain].[Product Plain].[Product L].[Crab] )
SELECT
  [Measures].members ON COLUMNS,
  NON EMPTY [Age Group].[Age Group - L] ON ROWS
FROM [Cube]
WHERE [Stats].[Filter].[All-M].[filter]

SubCube functions return a SubCube object and currently can only be attached to category members. On the above MDX, we’re attaching our SubCube to a filter member that we can use as a normal MDX member.

We can also use categories and SubCube functions for axis, as in this example:

WITH
  CATEGORY HIERARCHY [Stats].[Filter]
  CATEGORY MEMBER [Stats].[Filter].[All-M].[With TV] AS {[Product].[Product Family].[Product Family L].[TV]}
  CATEGORY MEMBER [Stats].[Filter].[All-M].[TV without Seefood] AS SubCubeMinus( [Product].[Product Family].[Product Family L].[TV], [Product].[Product Family].[Product Family L].[Seafood] )
  CATEGORY MEMBER [Stats].[Filter].[All-M].[TV without VideoGames] AS SubCubeMinus( [Product].[Product Family].[Product Family L].[TV], [Product].[Product Family].[Product Family L].[Video Games] )
  CATEGORY MEMBER [Stats].[Filter].[All-M].[TV without Cheese] AS SubCubeMinus( [Product].[Product Family].[Product Family L].[TV], [Product].[Product Family].[Product Family L].[Cheese] )
SELECT
  [Measures].members ON COLUMNS,
  [Stats].[Filter].[All-M].children ON ROWS
FROM [Cube]

In our next post, we’re going to use object-oriented extensions to make some calculations. See you in Chapter III, presenting object-oriented icCube extensions.

By David Alvarez