Modak Nabu™ enables enterprises to automate data ingestion, curation, and consumption processes at petabyte scale. https://modak.com/
What is a Data Fabric?
The terms Data Fabric and Data Mesh are now routinely used in data management and engineering circles. Given
the hype and marketing around them, agreement on their definitions and usage patterns is proving difficult. The purpose
of this short paper is to provide clarity from an adoption perspective.
Data is dispersed throughout an enterprise in a variety of structures and formats, spanning numerous applications,
databases, data warehouses, and data lakes. The migration of on-premises data repositories to the cloud extends the data
landscape even further, and with Data Scientists requiring external data sets to continuously feed self-learning models,
the complexity of managing data is increasing rapidly. The need to think about new data management designs and
practices is now front and center in the industry.
A Data Fabric needs to be seen from a data management design viewpoint, not from an implementation perspective.
No single solution can act as a comprehensive one-stop shop for a Data Fabric. Instead, multiple providers and
consumers of data need to be brought together around three core tenets: agility, integration, and automation.
These tenets are supported by an active metadata repository that captures technical and business metadata from
the sources, which is then visualized through semantic knowledge graphs. A Data Fabric provides data engineers
and subject matter experts with the foundations to curate and deliver data domain products.
The main objective of a Data Fabric is to provide a “net” that is cast to stitch together multiple heterogeneous
data sources and types, through automated data pipelines that populate an active metadata repository.
This allows data sets to be grouped logically (without moving the data) into virtual data domains, where augmentation
techniques such as tagging (for example, classifying PHI data) or ML algorithms can be applied to automate data quality
checks and the cataloging of data sets.
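The idea of an active metadata repository with tagging and virtual data domains can be sketched in a few lines of Python. This is an illustrative sketch only, not how any particular product implements it: the class names (DatasetEntry, MetadataRepository), the PHI_PATTERN rule, and the example locations are all hypothetical. A simple column-name rule also stands in for the augmentation or ML step described above.

```python
import re
from dataclasses import dataclass, field

# Assumption: a naive column-name rule stands in for real PHI classification.
PHI_PATTERN = re.compile(r"(ssn|dob|birth|patient|diagnosis|mrn)", re.IGNORECASE)


@dataclass
class DatasetEntry:
    """Metadata for one data set; the data itself stays at `location`."""
    name: str
    source: str        # e.g. "postgres" or "s3" (hypothetical labels)
    location: str      # pointer to where the data physically lives
    columns: list[str]
    tags: set[str] = field(default_factory=set)


class MetadataRepository:
    """Hypothetical in-memory stand-in for an active metadata repository."""

    def __init__(self) -> None:
        self.entries: dict[str, DatasetEntry] = {}
        self.domains: dict[str, set[str]] = {}

    def register(self, entry: DatasetEntry) -> None:
        # Automated tagging: mark the data set as PHI if any column name matches.
        if any(PHI_PATTERN.search(col) for col in entry.columns):
            entry.tags.add("PHI")
        self.entries[entry.name] = entry

    def add_to_domain(self, domain: str, dataset_name: str) -> None:
        # A virtual domain only records names; no data is copied or moved.
        if dataset_name not in self.entries:
            raise KeyError(f"unknown data set: {dataset_name}")
        self.domains.setdefault(domain, set()).add(dataset_name)

    def domain_datasets(self, domain: str) -> list[str]:
        return sorted(self.domains.get(domain, set()))

    def find_by_tag(self, tag: str) -> list[str]:
        return sorted(n for n, e in self.entries.items() if tag in e.tags)


if __name__ == "__main__":
    repo = MetadataRepository()
    repo.register(DatasetEntry("claims", "postgres", "postgres://warehouse/claims",
                               ["claim_id", "patient_id", "diagnosis_code"]))
    repo.register(DatasetEntry("orders", "s3", "s3://lake/orders/",
                               ["order_id", "order_total"]))
    repo.add_to_domain("healthcare", "claims")
    print(repo.find_by_tag("PHI"))            # ['claims']
    print(repo.domain_datasets("healthcare"))  # ['claims']
```

The point of the design is that the repository holds only pointers and descriptions. Grouping "claims" into the "healthcare" domain changes metadata, while the underlying table stays where it is.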
As such, a data fabric design is a