Data Mining Discussion 3 b



What do we understand by “multidimensional data model”? What is a “data cube”?

A multidimensional data model organizes a database around a central theme (e.g. sales) and views it along multiple dimensions of study such as time, item, branch, and location. Each dimension, in turn, is described by attributes of its own; item, for example, could have the attributes item_name and brand. A data cube is the structure that lets data be modeled and viewed in these multiple dimensions: each cell corresponds to one combination of dimension values and holds the aggregated facts (e.g. units sold) for that combination.
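To make the idea of viewing data along different dimensions concrete, here is a minimal sketch of a tiny cube in plain Python. The records, the `units_sold` figures, and the helper name `view` are all made up for illustration; a real warehouse would use an OLAP engine rather than dictionaries.

```python
from collections import defaultdict

# Hypothetical sales records: (quarter, item, location, units_sold).
records = [
    ("Q1", "laptop", "Chicago", 5),
    ("Q1", "phone", "Chicago", 8),
    ("Q1", "laptop", "Toronto", 3),
    ("Q2", "laptop", "Chicago", 7),
    ("Q2", "phone", "Toronto", 4),
]

# Base cube: one cell per (time, item, location) combination.
cube = defaultdict(int)
for quarter, item, location, units in records:
    cube[(quarter, item, location)] += units


def view(cube, keep):
    """Aggregate the cube onto the dimensions whose positions are in `keep`."""
    result = defaultdict(int)
    for cell, units in cube.items():
        result[tuple(cell[i] for i in keep)] += units
    return dict(result)


# 1-D view: units sold per item (time and location summed away).
by_item = view(cube, keep=[1])

# 2-D view: units sold per (quarter, location).
by_time_location = view(cube, keep=[0, 2])

print(by_item)
print(by_time_location)
```

The same base cells serve every view; only the set of dimensions kept changes.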

Explain in your own words the following concepts and use an example to illustrate your explanations: snowflake schema, fact constellation, and star schema.

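Star schema: In this schema the data warehouse contains one large central fact table, which holds the bulk of the data with no redundancy, and a set of smaller dimension tables, one per dimension. Pictured as a graph, the schema resembles a starburst, with the dimension tables arranged in a radial pattern around the central fact table. For example, a sales fact table could sit at the center, surrounded by time, item, branch, and location dimension tables.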

Snowflake schema: A snowflake schema is a variant of the star schema in which some dimension tables are normalized, splitting their data into additional tables. For example, a location dimension table could keep the city and move the country into a separate table that it references by key. The resulting graph forms a shape like that of a snowflake.

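Fact constellation: A fact constellation contains multiple fact tables that share dimension tables. For example, a sales fact table and a shipping fact table could both use the same item and location dimension tables. The schema can be viewed as a collection of stars, hence the name fact constellation.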
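The three schemas can be sketched side by side with an in-memory SQLite database. This is only an illustration: every table name, column name, and data value below is an assumption made up for the example, not part of any real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema: one central fact table with denormalized dimension tables.
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE fact_sales (
    item_key INTEGER REFERENCES dim_item(item_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    units_sold INTEGER
);

-- Snowflake variant of the location dimension: country is normalized
-- into its own table and the city table refers to it by key.
CREATE TABLE dim_country (country_key INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_city (
    city_key INTEGER PRIMARY KEY,
    city TEXT,
    country_key INTEGER REFERENCES dim_country(country_key)
);

-- Fact constellation: a second fact table shares dim_item and dim_location.
CREATE TABLE fact_shipping (
    item_key INTEGER REFERENCES dim_item(item_key),
    location_key INTEGER REFERENCES dim_location(location_key),
    units_shipped INTEGER
);

INSERT INTO dim_item VALUES (1, 'laptop', 'Acme'), (2, 'phone', 'Acme');
INSERT INTO dim_location VALUES (1, 'Chicago', 'USA'), (2, 'Toronto', 'Canada');
INSERT INTO fact_sales VALUES (1, 1, 5), (2, 1, 8), (1, 2, 3);
INSERT INTO fact_shipping VALUES (1, 1, 6), (2, 2, 2);
""")

# Star-style query: join the fact table to one dimension and aggregate.
sold_per_item = conn.execute("""
    SELECT i.item_name, SUM(f.units_sold)
    FROM fact_sales f JOIN dim_item i ON f.item_key = i.item_key
    GROUP BY i.item_name ORDER BY i.item_name
""").fetchall()

# Constellation: the shipping fact table reuses the same item dimension.
shipped_per_item = conn.execute("""
    SELECT i.item_name, SUM(f.units_shipped)
    FROM fact_shipping f JOIN dim_item i ON f.item_key = i.item_key
    GROUP BY i.item_name ORDER BY i.item_name
""").fetchall()

print(sold_per_item)
print(shipped_per_item)
```

Note the trade-off the snowflake tables illustrate: normalizing removes the repeated country value but costs an extra join whenever a query needs it.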

What is a data cube measure? Any examples?

A data cube measure is a numeric function that can be evaluated at each point in the data cube space. The value for a given point is computed by aggregating the data corresponding to the dimension-value pairs that define that point. An example is a distributive measure such as sum() or count(), which can be computed by applying the function to partitions of the data and then combining the partial results.
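What "distributive" means can be checked with a few lines of Python. The two partitions and their values below are hypothetical; the point is only that aggregating the partial results gives the same answer as aggregating all the data at once.

```python
# Hypothetical units_sold values behind one cube cell, split into two partitions.
partition_a = [5, 8, 3]
partition_b = [7, 4]

# Distributive sum: apply sum() to each partition, then sum() the partial results.
total_from_parts = sum([sum(partition_a), sum(partition_b)])
total_direct = sum(partition_a + partition_b)

# count() is distributive as well; partial counts are combined by adding them.
count_from_parts = len(partition_a) + len(partition_b)
count_direct = len(partition_a + partition_b)

print(total_from_parts == total_direct)
print(count_from_parts == count_direct)
```

This property is what allows a measure to be computed piecewise across partitions of a large cube.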

Explain and provide an example of an OLAP operation for multidimensional data.

OLAP operations materialize the different views available in a multidimensional model, which allows for interactive querying and analysis of the data at hand. An example is the roll-up operation, which performs aggregation on a data cube either by climbing up a concept hierarchy for a dimension (e.g. aggregating location from city to country) or by dimension reduction (e.g. dropping the location dimension altogether).
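Both forms of roll-up can be sketched in a few lines of Python. The cells, the city-to-country hierarchy, and the function names here are assumptions chosen for the example, not a real OLAP API.

```python
from collections import defaultdict

# Hypothetical concept hierarchy for the location dimension: city -> country.
city_to_country = {"Chicago": "USA", "New York": "USA", "Toronto": "Canada"}

# Hypothetical cube cells keyed by (quarter, city), holding units sold.
cells = {
    ("Q1", "Chicago"): 13,
    ("Q1", "New York"): 6,
    ("Q1", "Toronto"): 3,
    ("Q2", "Chicago"): 7,
    ("Q2", "Toronto"): 4,
}


def roll_up_location(cells):
    """Roll up by climbing the location hierarchy from city to country."""
    out = defaultdict(int)
    for (quarter, city), units in cells.items():
        out[(quarter, city_to_country[city])] += units
    return dict(out)


def roll_up_drop_location(cells):
    """Roll up by dimension reduction: remove the location dimension."""
    out = defaultdict(int)
    for (quarter, _city), units in cells.items():
        out[quarter] += units
    return dict(out)


by_country = roll_up_location(cells)
by_quarter = roll_up_drop_location(cells)

print(by_country)
print(by_quarter)
```

In both cases the result has fewer, coarser cells than the input, which is the defining effect of a roll-up.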