Data Mining Discussion 3 a

  • What is a data warehouse?

There are many definitions of a data warehouse, but according to the book, a data warehouse is a consistent data store that serves as a physical implementation of a decision support data model.

  • What are some of the differences between operational database systems and data warehouses?

One difference is orientation. Operational databases, or OLTP systems, are generally customer-oriented and are used for transaction and query processing. A data warehouse, or OLAP system, is market-oriented. It is used for data analysis and generally helps executives make decisions based on findings from that data. Another difference is the data itself: OLTP systems work with current data, while OLAP systems manage large amounts of historical data.
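
To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and the function names are made up for illustration, and a real warehouse would live in a separate system rather than in the same table. The point is only the shape of the two workloads: a short single-row transaction versus an aggregate over all of the history.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, "
        "year INTEGER, amount REAL)"
    )
    rows = [
        ("alice", "east", 2022, 120.0),
        ("bob", "west", 2022, 80.0),
        ("alice", "east", 2023, 200.0),
        ("carol", "west", 2023, 50.0),
    ]
    conn.executemany(
        "INSERT INTO sales (customer, region, year, amount) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def record_sale(conn, customer, region, year, amount):
    """OLTP-style work: one short transaction that touches a single row."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO sales (customer, region, year, amount) VALUES (?, ?, ?, ?)",
            (customer, region, year, amount),
        )
    return cur.lastrowid


def sales_by_region_and_year(conn):
    """OLAP-style work: scan the full history and summarize it by group."""
    return conn.execute(
        "SELECT region, year, SUM(amount) FROM sales "
        "GROUP BY region, year ORDER BY region, year"
    ).fetchall()
```

Calling `record_sale` adds one current transaction, while `sales_by_region_and_year` reads every row to produce the kind of summary a decision maker would look at.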

  • What is the rationale of constructing a separate data warehouse, when online analytical processing could be performed directly on operational databases?

The main reason is to keep performance high on both systems. An OLTP system is designed for known tasks and workloads, while an OLAP system often runs complex queries that involve computation over large groups of data. Processing OLAP queries in an operational database would degrade the performance of the operational tasks.

  • What is a metadata repository and what are some of the elements it should contain?

A metadata repository is a repository that contains data about data. Among other things, it should contain a description of the data warehouse structure and operational metadata, such as the currency of the data and monitoring information.
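
As a rough illustration of what "data about data" could look like, here is a small Python sketch. Everything in it is an assumption made for the example: the class names, the fields, and the three currency states are not a standard API, just one way to hold the elements listed above (structure, currency of data, and a simple usage counter standing in for monitoring information).

```python
from dataclasses import dataclass, field
from enum import Enum


class Currency(Enum):
    """Hypothetical currency states for the data in a warehouse table."""
    ACTIVE = "active"
    ARCHIVED = "archived"
    PURGED = "purged"


@dataclass
class TableMetadata:
    """Metadata for one warehouse table (all field names are illustrative)."""
    name: str
    columns: dict            # structure: column name -> column type
    source_system: str       # where the data came from
    currency: Currency = Currency.ACTIVE
    query_count: int = 0     # monitoring: how often the table was queried


@dataclass
class MetadataRepository:
    """Holds metadata entries keyed by table name."""
    tables: dict = field(default_factory=dict)

    def register(self, meta: TableMetadata) -> None:
        self.tables[meta.name] = meta

    def record_query(self, table_name: str) -> None:
        # Update the usage statistic each time the table is queried.
        self.tables[table_name].query_count += 1

    def archive(self, table_name: str) -> None:
        # Mark the table's data as no longer current.
        self.tables[table_name].currency = Currency.ARCHIVED
```

For example, registering a `sales_fact` table and calling `record_query("sales_fact")` would leave the repository knowing the table's columns, its source, that its data is still active, and that it has been queried once.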