Real-Time Data Sources (aka. Incremental Load)

Real-Time data sources extends regular data sources to support incremental updates of the cubes. Note that the incremental load feature is not related to any type of data source; indeed you can incrementally load data from relational DB, from CSV files, etc...

Table Loading Strategy

When defining a new data table in your data source, you have the option to define a loading strategy:

  • Static : the strategy meaning no incremental load is performed. The table is read once at cube creation time and not accessed anymore.
  • Incremental Load : the table is fully read once at cube creation time and then is accessed from time to time to check for new data available. Its actual access might depend on the trigger tables (see below for more details).
  • Incremental Load (Trigger) : the table is fully read once at cube creation time and then is accessed from time to time to check for new data available. This kind of tables allow to trigger the actual check of regular incremental tables (see below for more details).
  • Full reload : the table is fully read once at cube creation time and then fully reload on each incremental load. This mode is only compatible if the table is used exclusively for defining a dimension. The table is only read during an incremental load if new data is available (defined by another Incremental Load table).

Incremental Load

An incremental load table needs to define a new column called incremental column hereafter. This column allows to detect any changes from the previous load command. Therefore, this is for example an SQL auto-incremented ID; but any other unique value incremented with each new row is fine.

The initial first load of the schema is performed as before; that is, the table is fully read. subsequent loads will then be done incrementally by looking at the value of the incremental column to detect any changes from the previous load. Note that all the changes to all the incremental tables should be consistent as a whole (e.g., new facts referencing new members) to prevent generating errors when updating the schema.

Incremental Load (Trigger)

Trigger tables are very similar to incremental tables. The difference is that trigger tables are incrementally loaded first and if any changes is detected in these tables, the other (that is the tables with no trigger) incremental tables will be checked for new data. Checking the trigger table only is more likely much more efficient regarding the speed (SQL access). This allows as well for example to ensure consistency changes between several tables: each table can be updated and when the whole change is consistent, the trigger table is updated.

Incremental Load and Many-To-Many relations limitation

An incremental load when updating a many-to-many relation will NOT change the already loaded fact associations. For example, when a fact row linked to member ONE had a many-to-many relation with member A,B and incremental load that adds the association between ONE and C will not update existing fact rows. So, C will not be linked with the already loaded facts linked to ONE. This is valid for hierarchies, many-to-many and bridge structures.

Full Load vs. Incremental Load

The UI (and scheduler) is supporting two types of load schema commands:

  • Full Load : whether or not a schema supports incremental tables, a full load (and possibly re-load) of the schema is performed.
  • Incremental Load : only updates are applied to a loaded schema; if the schema is not loaded yet a full load is being performed.

If an error occurs during an incremental load, the schema is switched to a state where the incremental load commands are de-activated. It is then highly recommended to perform a full reload as the schema is in an uncertain state (possibly inconsistent). Note that this kind of switch can be externally (e.g., eMail) notified (see the notification service for more details).

Scheduler

The scheduler is now supporting incremental load vs. full load commands. Note that the both commands can be scheduled to for example perform a full reload at night and then incremental loads during working hours.

Walkthrough

For a concrete working example please have a look to the following page.