MDX Tutorial (Gentle Introduction)
This tutorial describes MDX and its main concepts. It is meant to be a gentle introduction and is targeted to any person who wants to get a quick grasp of MDX capabilities.
MDX stands for 'Multi-Dimensional Expressions' and is the standard language defined by Microsoft to query OLAP servers.
At first glance, it may appear similar to SQL. However, MDX is a completely new language. SQL was designed to query dimensional data structures, called tables, where data are organized in rows and columns. In OLAP, data are organized around multiple measures, dimensions, hierarchies, and levels.
MDX is a language used to perform calculations and analysis around OLAP structures. MDX includes a rich set of functions for 'performing' statistical analysis. Unlike SQL, MDX does not have DDL (Data Definition) or DML (Data Manipulation) capabilities. OLAP structures are defined and modified in XMLA (XML for Analytics). MDX is purely for analyzing and reading data.
As OLAP servers are mainly using relational databases as their sources of data, we will sometimes use SQL concepts to describe functionality.
Similar to the way tables and columns are central to SQL, dimensions, hierarchies, and levels, are the centerpieces of MDX. They are mapping business models into language-specific concepts (e.g. a list of countries will be mapped as an MDX dimension).
The most natural way to explain these concepts is with an example.
Let's introduce a classic sales related business problem. We would like to generate some charts on our sales department and the information we have is as follows:
- Pierre Dupont sold one corporate license 2009 Q1 in Geneva
- Pierre Dupont sold one corporate license 2009 Q3 in Geneva
- Pierre Dupont sold one corporate license 2010 Q1 in Paris
- Rosa Maza sold one corporate license 2009 Q2 in Madrid
- Rudolf von Richthofen sold one partnership license 2009 Q1 in Zurich
- Rudolf von Richthofen sold one partnership license 2009 Q3 in Zurich
- John Bin sold one corporate license 2009 Q2 in New York
- Patty Bing sold one corporate license 2009 Q4 in Los Angeles
With this example we've introduced the concepts of 'Sales People', 'License', 'Time', 'City', and 'amount of licenses'. In a relational database we could model this with one table per concept and foreign keys in a 'amount_of_licenses' table.
In MDX 'Sales People', 'License', 'Time', and 'City', will be modeled as dimensions and the 'amount of licenses' as a measure. Measures in MDX are a special case of dimensions and they hold numeric values.
The concrete value of a dimension (e.g. Paris) is a member of the dimension and is similar to the value of an SQL table.
An OLAP cube, then, is a collection of dimensions indexing a list of measures.
You should not try to visualize the OLAP cube as a geometrical cube. A geometrical cube can have only three dimensions, whereas an OLAP cube can have many dimensions. In our example above, we have four dimensions ('Sales People', 'License', 'City', and 'Time') and one numeric measurable quantity ('License Quantity Sold'). Of course, real-life business problems can have many more dimensions and measures.
To start, let's look at an MDX statement that gives us a table with the amount of licenses sold per country per year:
SELECT ([Geography].[Geo].[Country].members) ON 0, ([Time].[Year].members ) ON 1 FROM [Sales]
As we would like to compute some statistics on this data, we will enrich the information with additional details.
We would like to organize the countries as a geographical tree (Continent, Country, City) and economic partnerships (EU, NAFTA, None).
For the time dimension, we're interested in years and quarters.
To continue with this example we will introduce the concepts of dimension, hierarchy, and levels, in the next chapter.
( Do you want to try this out live? Our Community Edition is available here. )