In this post we’re going to explain one of the latest functions in icCube , sumOn / VectorK , and how it can be used.
This start with the use case in the healthcare sector that was tricky to solve. It’s simplification of the actual scenario but we have data about the stay in a hospital of a list of patients that look like :
Stay | PatiendId | Departement | Traitement | Cost |
---|---|---|---|---|
1 | 1 | Cardiology | Ex | 75 |
1 | 1 | Radiology | IRM | 750 |
1 | 1 | Cardiology | Op | 150 |
2 | 2 | Psychiatry | Ex | 150 |
We want to be able to calculate the cost of the whole stay using as filter Department and Traitment.
The standard solution in MDX would be to filter the Stay members and afterwards calculate the stay cost (it’s not the traitement cost), something like :
WITH // we need to reset the filter so we have all the traitements for a stay MEMBER [Stay Cost] as ( [Patient].defaultMember, [Department].defaultMember, [Traitement].defaultMember, [Measures].[Cost] ) SELECT [Measures].[Stay Cost] on 0, NonEmpty( [Stay].[Stay].[Stay - L] , [Measures].[Cost] ) on 1 FROM [Cube] WHERE [Department].[Department].[Department].[Radiology]
This will work for small schemas and when the [Stay] dimension is not very large but it’s going to get worst when we have millons of rows and have an increasing number of stays.
One solution would be to create a fact table that uses many-to-many relations (check ‘GroupBy on Stay’ view in the schema enclosed at the end of this post).
Stay | Departement | Traitement | Cost |
---|---|---|---|
1 | [Cardiology Radiology Cardiology] | [Ex IRM Op] | 975.0 |
2 | [Psychiatry] | [Ex] | 150.0 |
3 | [Oncology Laboratory Laboratory Radiology Oncology] | [Ex GF1 FG2 IRM Op] | 7365.0 |
If this is possible, go for it. In our case this was not possible as we wanted to be able to filter on multiple dimensions using an AND relation. For example, we want to get the stays that had a row with Cardiology AND Ex traitement. The many-to-many relation answers to the question Cardiology OR Ex, so it can not be used as you can see in the table this information was lost in the transformation. We could indeed had created a composed dimension ( Department + Treatment ) but this was not ideal as we wanted the end users to be able to create their own reports with easy filters.
So eventually, the solution is a mix of icCube features based around the Vector aggregation. As a summary of the solution: the new calculation is going to get a measure that returns a vector with unique stays and we will use this vector with the stays to calculate the cost. How ?
a) Create a Measure with ‘Vector Unique’ as aggregation method on the stayId column -> [V – Stay]
b) Create a Vector that will hold for each member id the Stay Cost at the position defined by the id of the member (that has to be an integer). For example, [Stay 23] member that has an id of 23 will put his cost at the position 23 of the vector. To be memory efficient there should be no holes in the stayId values and they should start with 0, so the vector would be as small as possible.
We can already check our ideas :
WITH // create a vector with the cost by stay for each stay STATIC FUNCTION VStayCost() as VectorK( [Stay].[Stay].[Stay - L] , [Measures].[Cost] ) // This will sum for each stay defined in the measure it's cost MEMBER [Stay Cost2] as SumOn( VStayCost() , [Measures].[V - Stay ] ) SELECT { [Measures].[Stay Cost2], [Measures].[Stay Cost] } on 0, NonEmpty( [Stay].[Stay].[Stay - L] , [Measures].[Cost] ) on 1 FROM [Cube] WHERE [Department].[Department].[Department].[Radiology]
STATIC FUNCTION allows us to evaluate only once ignoring the current cell context: filters, subselects and where clause.
VectorK, is a function that evalautes the expression and sets the value in the position defined by the member key. SumOn allows to sum to vectors, the first one defined the values to sum, the second one the positions to sum.
As our final detail, we move the Function and the measure declaration to the schema script definition and it’s done. Our robust and amazingly fast new calculation is there.
You can download the self-contained schema from here.
Hope you enjoyed !
By David Alvarez