KB441285: Optimization for the Intelligent Cube Data Structure at the time of publication

Principal Product Specialist • MicroStrategy

In MicroStrategy Secure Enterprise, an Intelligent Cube goes through an Attribute Index-based optimization during publication to produce the smallest effective size based on attribute cardinality.

The Attribute Index

An Intelligent Cube is an In-Memory data structure that holds information from a source in a readily accessible format for quick access. Intelligent Cubes can be partitioned, created through Data Import, or hold a small standard data set, depending on the needs of the situation. To host all of this data in the least amount of memory, one of the key optimizations is the internal Attribute Index.

The Attribute Index is a mapping created during the initial publication of any Intelligent Cube type. It stores how the Element Block structure should be built and how relations are created to minimize the size of the Intelligent Cube. To build the Attribute Index, the Intelligence Server analyzes all of the data populated by the initial publication and determines the cardinality of each included attribute. This cardinality forms the basis for the ordering.

For example, an optimized Intelligent Cube will look like the following:

Attribute Index, Name, Partition Index, Cardinality, Number of Forms
0, Attribute_Name_0, -1, 3, 1
1, Attribute_Name_1, -1, 4, 1
2, Attribute_Name_2, -1, 5, 1
3, Attribute_Name_3, -1, 9, 1
4, Attribute_Name_4, -1, 16, 1
5, Attribute_Name_5, -1, 27, 1
6, Attribute_Name_6, -1, 31, 1
7, Attribute_Name_7, -1, 49, 1
8, Attribute_Name_8, -1, 54, 1
9, Attribute_Name_9, -1, 102, 1

This Intelligent Cube includes 10 Attributes, none of which are partitioned. During publication, the Intelligence Server determined the total cardinality of each included Attribute and ordered the Attributes by ascending cardinality to remove repetition from the higher-level Attributes. As a result, the Element Blocks of the data structure hold only 3 instances of Attribute_Name_0. If that Attribute were at the bottom of the ordering instead, its instances would be repeated for each preceding Attribute.
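The ordering step described above can be sketched in a few lines of Python. This is a simplified illustration, not the Intelligence Server's implementation: the function name `attribute_index_order`, the sample rows, and the tie-breaking behavior are all assumptions made for the example.

```python
def attribute_index_order(rows):
    """Return (attribute name, cardinality) pairs in ascending cardinality order.

    rows: list of dicts mapping attribute name -> element value
    (a hypothetical stand-in for the data populated at publication).
    """
    distinct = {}
    for row in rows:
        for name, value in row.items():
            # Track the distinct elements seen for each attribute.
            distinct.setdefault(name, set()).add(value)
    # sorted() is stable, so ties keep first-seen order; how the real
    # engine breaks ties is not documented in this article.
    return sorted(
        ((name, len(values)) for name, values in distinct.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical sample data: 5 rows, 3 attributes.
rows = [
    {"Day": "2017-12-31", "Year": 2017, "Region": "East"},
    {"Day": "2018-01-01", "Year": 2018, "Region": "East"},
    {"Day": "2018-01-02", "Year": 2018, "Region": "West"},
    {"Day": "2018-01-03", "Year": 2018, "Region": "East"},
    {"Day": "2019-01-01", "Year": 2019, "Region": "West"},
]

for index, (name, cardinality) in enumerate(attribute_index_order(rows)):
    print(index, name, cardinality)
```

With this sample, Region (2 elements) is placed first, then Year (3), then Day (5), mirroring the lowest-cardinality-first layout shown in the table.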

Determining the current cardinality

The Attribute Index built into the Intelligent Cube data structure cannot be viewed in the GUI elements of Strategy; it is instead available through the Dimensional Storage logs. These logs can be enabled through the Strategy Diagnostics and Configuration Tool and require an Intelligence Server restart before they take effect on the environment.

See: How to configure additional traces using Diagnostics Configuration Tool.

In this tool, the trace required for reviewing the Intelligent Cube structure is enabled under both 'Machine Default' and 'CastorServer Instance' by setting the File Log for the Component 'Dimensional Storage' and Dispatcher 'Common Trace'. After the Intelligence Server restarts, this log captures how the structure is defined whenever Intelligent Cubes are loaded during startup or published, with output similar to the following:

[Dimensional Storage][Common Trace] --- Show Cube Info ---
--- Show Cube Basic Info ---
Cube ID : D03E4FFA4342140F3938EF981D2096A1
Cube Name: Test_Intelligent_Cube
AE Cube Size = 36864 Bytes, 36 KB
Attribute Size = 4635 Bytes, 4 KB
Relation Size = 0 Bytes, 0 KB
Metric Size = 32 Bytes, 0 KB
Index Pool Size(containing element block and index space) = 32193 Bytes, 31 KB
Element Block Size = 4096 Bytes, 4 KB
Index Space Size = 1877 Bytes, 1 KB
Index Key Size = 0 Bytes, 0 KB
Search Tree Size = 0 Bytes, 0 KB
OriTable = 4 Bytes, 0 KB
--- Show Cube Attribute Info ---
Attribute Index,Attribute Name,Partition Index,Cardinality,Number of Forms
0,Month of Year,-1,2,2,(Short UTF8Char )
1,Year,-1,4,1,(Short )
2,Day,-1,237,1,(Date )
--- Total Attribute Count = 3 ---
--- Show Cube Relation Info ---
Relation Index,Relationship,Parent,Child,Partition Index,Relation Rows,Parent Cardinality,Child Cardinality
--- Total Relation Count = 0 ---
--- Show Cube Metrics Info ---
Metric Index,Metric Name,Partition,Data Rows,Distinct Data Rows,Data Type,Orig Data Type,Converted,Compressed,Potential Saving (bytes)
--- Total Metric Count = 0 ---
--- Show Cube Index Pools Info ---
Index Pool,Partition,Index,contained Attributes,Attribute ElementBlocks Number,rows,IndexPool Size(KB),ElementBlock Size(KB)
0,-1,0,Month of Year|Year|Day,2|8|237,237,8,4
--- Total Index Pools Count = 1 ---
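To check the ordering without reading the log by eye, the attribute section of the output above can be extracted with a short script. The following is a minimal sketch that assumes the section layout matches the sample shown here and that attribute names contain no commas; `parse_attribute_info` is a hypothetical helper, not part of any Strategy tool.

```python
def parse_attribute_info(log_text):
    """Extract rows from the 'Show Cube Attribute Info' section of a log excerpt."""
    attributes = []
    in_section = False
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.startswith("--- Show Cube Attribute Info"):
            in_section = True
            continue
        if not in_section:
            continue
        if line.startswith("---"):
            break  # next marker, e.g. '--- Total Attribute Count = 3 ---'
        if line.startswith("Attribute Index"):
            continue  # header row
        fields = line.split(",")
        if len(fields) < 5:
            continue  # not a data row
        attributes.append({
            "index": int(fields[0]),
            "name": fields[1],
            "partition_index": int(fields[2]),
            "cardinality": int(fields[3]),
            "forms": int(fields[4]),
        })
    return attributes


# Excerpt copied from the sample log output in this article.
sample_log = """--- Show Cube Attribute Info ---
Attribute Index,Attribute Name,Partition Index,Cardinality,Number of Forms
0,Month of Year,-1,2,2,(Short UTF8Char )
1,Year,-1,4,1,(Short )
2,Day,-1,237,1,(Date )
--- Total Attribute Count = 3 ---
"""

attributes = parse_attribute_info(sample_log)
cardinalities = [a["cardinality"] for a in attributes]
# True when the Attribute Index order is still ascending by cardinality.
print(cardinalities == sorted(cardinalities))
```

For the sample cube, the cardinalities 2, 4, and 237 are already in ascending order, so the check reports that the ordering matches what a fresh publication would produce.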

Impact on Incremental Refresh Intelligent Cubes

With the growing size of Intelligent Cubes and datasets used for reporting, one common method for refreshing large amounts of data is Incremental Refresh on an Intelligent Cube object. This allows an already published Intelligent Cube to have new information merged into the dataset without going through a full republication.

In situations where the time or resources needed for a full republication are not often available, Incremental Refresh is an easy choice for bringing in new information without impacting the other procedures of the Intelligence Server. However, depending on the design of the Intelligent Cube, Incremental Refresh can affect its memory use.

The Attribute Index and cardinality order are generated during the initial publication of an Intelligent Cube and are optimized for the data available at that time. If an Incremental Refresh brings in new data that changes the relative cardinality of the attributes, the original ordering is kept, and the Intelligent Cube will have a larger memory footprint than it would after a full republication.

For example, consider two Intelligent Cubes: the first is a full publication of 10 years' worth of data, and the second is created from a single year of data and then Incrementally Refreshed to match the first.

Full Publication (10 Years of Data):
Attribute Index, Attribute Name, Partition Index, Cardinality, Number of Forms
0, Attribute_Name_0, -1, 3, 1
1, Attribute_Name_1, -1, 4, 1
2, Attribute_Name_2, -1, 5, 1
3, Attribute_Name_3, -1, 9, 1
4, Attribute_Name_4, -1, 16, 1
5, Attribute_Name_5, -1, 27, 1
6, Attribute_Name_6, -1, 31, 1
7, Attribute_Name_7, -1, 49, 1
8, Attribute_Name_8, -1, 54, 1
9, Attribute_Name_9, -1, 102, 1

Initial Publication (1 Year of Data):
Attribute Index, Attribute Name, Partition Index, Cardinality, Number of Forms
0, Attribute_Name_3, -1, 1, 1
1, Attribute_Name_0, -1, 3, 1
2, Attribute_Name_1, -1, 4, 1
3, Attribute_Name_2, -1, 5, 1
4, Attribute_Name_6, -1, 15, 1
5, Attribute_Name_4, -1, 16, 1
6, Attribute_Name_5, -1, 27, 1
7, Attribute_Name_9, -1, 30, 1
8, Attribute_Name_7, -1, 49, 1
9, Attribute_Name_8, -1, 54, 1

Incremental Refresh (10 Years of Data to match first example):
Attribute Index, Attribute Name, Partition Index, Cardinality, Number of Forms
0, Attribute_Name_3, -1, 9, 1
1, Attribute_Name_0, -1, 3, 1
2, Attribute_Name_1, -1, 4, 1
3, Attribute_Name_2, -1, 5, 1
4, Attribute_Name_6, -1, 31, 1
5, Attribute_Name_4, -1, 16, 1
6, Attribute_Name_5, -1, 27, 1
7, Attribute_Name_9, -1, 102, 1
8, Attribute_Name_7, -1, 49, 1
9, Attribute_Name_8, -1, 54, 1

As a result of this method of republication, the final Incrementally Refreshed Intelligent Cube has Attribute 3 at the top of the order instead of Attribute 0. This causes repetition of the information from Attribute 0: the Element Blocks in the structure must have 9 distinct blocks for Attribute 3, each containing all 3 data points for Attribute 0, instead of 3 distinct blocks for just Attribute 0. The effect can grow worse the smaller the initial publication set is, with the worst case being an Intelligent Cube published with no initial information.
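The mismatch in the Incremental Refresh table can also be found programmatically. The sketch below flags every attribute that sits above a lower-cardinality attribute in the Attribute Index order. The helper name `attributes_ranked_too_high` is hypothetical, and the check is only an illustration of the comparison made above, not a measure of actual memory cost.

```python
def attributes_ranked_too_high(attribute_rows):
    """Return names of attributes placed above a lower-cardinality attribute.

    attribute_rows: list of (name, cardinality) in Attribute Index order.
    """
    flagged = []
    for position, (name, cardinality) in enumerate(attribute_rows):
        later = [c for _, c in attribute_rows[position + 1:]]
        # An ascending order never has a smaller cardinality further down.
        if later and cardinality > min(later):
            flagged.append(name)
    return flagged


# Values copied from the Incremental Refresh table in this article.
incremental_refresh = [
    ("Attribute_Name_3", 9),
    ("Attribute_Name_0", 3),
    ("Attribute_Name_1", 4),
    ("Attribute_Name_2", 5),
    ("Attribute_Name_6", 31),
    ("Attribute_Name_4", 16),
    ("Attribute_Name_5", 27),
    ("Attribute_Name_9", 102),
    ("Attribute_Name_7", 49),
    ("Attribute_Name_8", 54),
]

print(attributes_ranked_too_high(incremental_refresh))
```

For this table the flagged attributes are Attribute_Name_3, Attribute_Name_6, and Attribute_Name_9, which are exactly the three whose cardinality grew after the initial one-year publication. Running the same check on the Full Publication ordering flags nothing.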

Because of how the In-Memory engine performs this optimization, Strategy recommends either conducting a Full Publication or, when utilizing Incremental Refresh, populating the initial publication with a majority sampling of the final data that matches the overall cardinality. With a majority sampling, the Intelligence Server has enough information to create an Attribute Index that is representative of the final product, which prevents improper ordering during this optimization step.

Details

Knowledge Article

Published: June 29, 2018

Last Updated: December 6, 2018