Categories
DWB - Our Approach Your Edge Blog

Building Data Warehouse and Business Intelligence

Our approach to build a Data Warehouse and Business Intelligence solution is based on two fundamental attitudes – elaborating business information requirements (Enterprise Information Modeling) and delivering business value as soon as possible (Hybrid Approach to Build DWH).

Enterprise Information Modeling

Modeling business information requirements aims to translate business users’ informational expectations (required queries, reports and inputs for applications) to a central comprehensive semantic model, called Enterprise Information Model (EIM), on one hand capturing business definitions and business rules, mapped to business requirements, uses cases, and processes, and on the other hand providing a semantic skeleton to which various physical artefacts (representations of business definitions and rules, e.g., tables, transformations) are linked. The Enterprise Information Model is used in all phases of building a Data Warehouse and business Intelligence solution: capturing business domains, entities, attributes, and their definitions during the Analysis phase, capturing data mappings and Data Quality controls during the Design phase, providing the inputs for developers and partial automation of development during the Development phase and providing inputs to elaborate test cases and scenarios during the Testing phase. 

We recommend an agile approach of building a Data Warehouse and Business Intelligence solution and discourage from a water fall approach. Thus, the following activities are not conducted completely in the following order but rather in parallel, on selected (business) parts of the total solution:

  1. Requirements Analysis – analysis of required reports, application inputs and overall business information requirements. 
  2. Establishing Enterprise Information Model – elaborating its entities and attributes, taking care of precise and agreed business definitions (Business Dictionary).
  3. Identification of Physical Source Data Models – physical models of data sources or already existing systems, tables, and attributes.
  4. Source Data Profiling and Analysis – basic business understanding of data sources or already existing data (including reference data, including cardinalities and volumes).
  5. High-level Source Mapping – mapping sources or already existing systems, tables and attributes to entities and attributes of the Enterprise Information Model; without detailed transformation formulas, just identification, e.g., “attribute a is needed for EIM.b”.
  6. Core Logical Data Modeling – translating Enterprise Information Model into a logical data model; design of exact data types and primary/foreign key relationships of the Core model.
  7. Data Mart Logical Data Modeling – translating requirements, reports and required application inputs to their logical data models.
  8. Physical Data Modeling – translating logical data models of the Core and Data Marts into their physical representations (including technical layers, e.g., for historization), respecting data volumes, database systems and non-functional requirements.
  9. Detailed Mapping – mapping data sources, the Core structures and Data Marts to the attributes of the Enterprise Information Model at the detailed, physical data model, levels; including detailed transformation formulas, e.g., “attribute EIM.b = T.x join S.y on … where”.
  10. Data Structures Development – implementing and deploying physical database structures based on Physical Data Models.
  11. ETL Design and Development – designing and developing ETL components for loading the Core from data sources and the Data Marts from the Core, based on the Detailed Mappings and Data Structures; whenever efficient, partially automated.

Hybrid Approach to Build DWH 

In general, there are two extreme ways to build a Data Warehouse (DWH) – a “top-down approach”, associated with Bill Inmon, and a “bottom-up approach”, associated with Ralph Kimball. Simply put, the “top-down approach” proposes to create a consistent and standardized Core of a DWH first, the Core to be used as the “source of truth” for all data marts and business needs. Simply put, the “bottom-up approach” proposes to start the other way around and focus first at the (dimensional) data marts to deliver business value as soon as possible while sacrificing consistency and standardization. We recommend a “hybrid approach” which uses the best from the both extreme ways. 

The “hybrid approach” focuses on rapid delivery of business value while not sacrificing data consistency, standardization, and reusability. The high-level steps to build a DWH using the “hybrid approach” comprise: 

  1. Logical Core Data Model Creation – using the approach, i.e., derived from the EIM, invest a few weeks to build a logical data model of the Core. 
  2. Logical Data Mart Data Models Creation – using the approach, i.e., translating requirements, reports, application inputs, build logical data models (3NF) of a few “overlapping” data marts.
  3. Data Marts to Core Logical Data Mapping – map the data marts to the Core at a detailed level to uncover the Core attributes being “shared” by the data marts.
  4. Dimensional Data Marts Deployment – while having the logical data mart models, create dimensional models (star schemas) as well and deploy the data marts.
  5. Build Core Data Layer – implement and load the Core data layer, including DQ, standardization and consolidation.
  6. Redundancy or Inconsistency Treatment – identify the same or “similar” (inconsistent) data fed to several data marts (using the Mapping) and “reconnect” these data marts to the Core.