Abstract
Data warehousing and on-line analytical processing (OLAP)
are essential elements of decision support, which has
increasingly become a focus of the database industry. Many
commercial products and services are now available, and all
of the principal database management system vendors now
have offerings in these areas. Decision support places some
rather different requirements on database technology
compared to traditional on-line transaction processing
applications. This paper provides an overview of data
warehousing and OLAP technologies, with an emphasis on
their new requirements. We describe back-end tools for
extracting, cleaning and loading data into a data warehouse;
multidimensional data models typical of OLAP; front-end
client tools for querying and data analysis; server extensions
for efficient query processing; and tools for metadata
management and for managing the warehouse. In addition to
surveying the state of the art, this paper identifies some
promising research issues: some relate to problems that the
database research community has worked on for years, while
others are only just beginning to be addressed. This
overview is based on a tutorial that the
authors presented at the VLDB Conference, 1996.
1. Introduction
Data warehousing is a collection of decision support
technologies, aimed at enabling the knowledge worker
(executive, manager, analyst) to make better and faster
decisions. The past three years have seen explosive growth,
both in the number of products and services offered, and in
the adoption of these technologies by industry. According to
the META Group, the data warehousing market, including
hardware, database software, and tools, is projected to grow
from $2 billion in 1995 to $8 billion in 1998. Data
warehousing technologies have been successfully deployed in
many industries: manufacturing (for order shipment and
customer support), retail (for user profiling and inventory
management), financial services (for claims analysis, risk
analysis, credit card ana