Formal approach to modelling a multiversion data warehouse

2006 
A data warehouse (DW) is a large centralized database that stores data integrated from multiple, usually heterogeneous, external data sources (EDSs). DW content is processed by so-called On-Line Analytical Processing (OLAP) applications, which analyze business trends and discover anomalies and hidden dependencies between data. These applications are part of decision support systems. EDSs constantly change their content and often change their structures. These changes have to be propagated into a DW, causing its evolution. The propagation of content changes is implemented by means of materialized views, whereas the propagation of structural changes is mainly based on temporal extensions and schema evolution, which limits the applicability of these techniques. Our approach to handling the evolution of a DW is based on schema and data versioning. This mechanism is the core of a so-called multiversion data warehouse. A multiversion DW is composed of a set of DW versions. A single DW version is in turn composed of a schema version and the set of data described by this schema version. Every DW version stores a DW state that is valid within a certain time period. In this paper we present: (1) a formal model of a multiversion data warehouse, (2) a set of operators, with their formal semantics, that support DW evolution, and (3) an impact analysis of the operators on DW data and user analytical queries. The presented formal model was the basis for implementing a multiversion DW prototype system.
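The composition described in the abstract (a multiversion DW as a set of DW versions, each pairing a schema version with its data and a validity period) can be illustrated with a minimal sketch. This is not the paper's formal model or its prototype: the class names, the `derive_version` and `version_at` helpers, the dictionary encoding of a schema, and the assumption that versions form a single linear sequence with day-granularity validity are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class DWVersion:
    """One DW version: a schema version plus the data it describes."""
    schema: Dict[str, List[str]]           # table name -> column names (assumed encoding)
    valid_from: date                       # first day this DW state is valid
    valid_to: Optional[date] = None        # None means the version is still open
    data: Dict[str, List[dict]] = field(default_factory=dict)

    def is_valid_at(self, day: date) -> bool:
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


@dataclass
class MultiversionDW:
    """A multiversion DW as an ordered collection of DW versions (assumed linear)."""
    versions: List[DWVersion] = field(default_factory=list)

    def derive_version(self, schema: Dict[str, List[str]], valid_from: date) -> DWVersion:
        # Close the currently open version the day before the new one starts.
        if self.versions and self.versions[-1].valid_to is None:
            self.versions[-1].valid_to = valid_from - timedelta(days=1)
        version = DWVersion(schema=schema, valid_from=valid_from)
        self.versions.append(version)
        return version

    def version_at(self, day: date) -> Optional[DWVersion]:
        # Return the version whose validity period covers the given day, if any.
        for version in self.versions:
            if version.is_valid_at(day):
                return version
        return None


# Hypothetical example: a structural change in an EDS adds a "region" column.
mdw = MultiversionDW()
v1 = mdw.derive_version({"sales": ["product_id", "amount"]}, date(2005, 1, 1))
v2 = mdw.derive_version({"sales": ["product_id", "amount", "region"]}, date(2006, 1, 1))
```

In this sketch, a query addressed to a given date is routed to the single version valid on that date, so older analytical queries keep seeing the schema they were written against.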