When the Data Warehouse was introduced, it brought a simple
and consistent dimensional design. Reporting tools benefited from the
simplicity and consistency of the underlying database structures and prospered.
Operational systems were happy to specialize in the transactional area and shed
other responsibilities elsewhere. So any representation that did not fit
on a single screen was delegated either to a dimensional database provided by
the vendor, or to a stand-alone data warehouse.
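To make the dimensional design concrete, here is a minimal sketch of a star
schema, using Python's built-in sqlite3 as a stand-in for a dimensional
database; the table and column names (sales_fact, date_dim, product_dim) are
illustrative, not taken from any particular product.

```python
import sqlite3

# A minimal star schema: one fact table surrounded by dimension tables.
# All names here are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (
        date_key INTEGER PRIMARY KEY,
        calendar_date TEXT,
        fiscal_quarter TEXT
    );
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        category TEXT
    );
    CREATE TABLE sales_fact (
        date_key INTEGER REFERENCES date_dim(date_key),
        product_key INTEGER REFERENCES product_dim(product_key),
        quantity INTEGER,
        amount REAL
    );
""")
conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?)",
                 [(1, "2023-01-15", "Q1"), (2, "2023-04-02", "Q2")])
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [(10, "Widget", "Hardware"), (20, "Gadget", "Hardware")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                 [(1, 10, 5, 50.0), (1, 20, 2, 80.0), (2, 10, 3, 30.0)])

# A reporting query always has the same shape: join the fact table to
# its dimensions, group by dimension attributes, aggregate the measures.
for row in conn.execute("""
    SELECT d.fiscal_quarter, p.category, SUM(f.amount)
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY d.fiscal_quarter, p.category
"""):
    print(row)
```

It is exactly this regularity, every report being the same join-group-aggregate
pattern over a fact table, that the reporting tools prospered on.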
The analytical groups grudgingly accepted the Data Warehouse.
The dimensional design did not match the statistical packages' internal
structures, but there was no alternative way to get the data. An additional
benefit was that the data was already cleansed and verified.
The landscape is starting to change with the proliferation of
in-memory databases and Big Data solutions.
In-memory database technology improves performance, and that
spare database speed can be used to solve new problems or to allow
less optimal designs for existing problems. In particular, it tolerates reporting
directly from operational data structures that were never optimized for
reporting. Nowadays ERP systems, such as SAP running on the HANA in-memory
database, can take on more reporting. And if the ERP is the main data source
for a Data Warehouse, the remaining sources can be absorbed into the ERP
system, eliminating the Data Warehouse completely.
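As a rough illustration of reporting directly against operational structures,
the sketch below keeps a normalized orders layout in an in-memory SQLite
database (again only a stand-in for an in-memory engine; the customers,
orders, and order_lines names are hypothetical). The point is that with enough
raw speed, the report can join the transactional tables directly, with no
extract into a star schema in between.

```python
import sqlite3

# Operational (normalized) tables, as an OLTP system would keep them.
# All table and column names are hypothetical.
conn = sqlite3.connect(":memory:")  # stands in for an in-memory database
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        region TEXT
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        order_date TEXT
    );
    CREATE TABLE order_lines (
        order_id INTEGER REFERENCES orders(order_id),
        sku TEXT,
        quantity INTEGER,
        unit_price REAL
    );
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "EMEA"), (2, "APAC")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(100, 1, "2023-03-01"), (101, 2, "2023-03-02")])
conn.executemany("INSERT INTO order_lines VALUES (?, ?, ?, ?)",
                 [(100, "A-1", 2, 9.5), (100, "B-2", 1, 20.0),
                  (101, "A-1", 4, 9.5)])

# The report runs straight against the transactional layout: no ETL,
# no star schema, just joins that an in-memory engine can afford.
for row in conn.execute("""
    SELECT c.region, SUM(l.quantity * l.unit_price) AS revenue
    FROM order_lines l
    JOIN orders o ON o.order_id = l.order_id
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.region
"""):
    print(row)
```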
Though Big Data is just a vague marketing term, it
involves a lot of important database innovation. The distributed file system
that underlies Hadoop, for example, can be had inexpensively. Soon it will
start to divert data and resources from the Data Warehouse. Of course, it is
not a zero-sum game; the Big Data reservoirs will also take on new data
sources: device-generated data, medical records, social networks, etc.
Still, it will be a loss for the DW.
So the traditional Data Warehouse will be squeezed from two
directions: less data extracted from operational databases, and more data
diverted to the Big Data reservoirs.