A Data Mart is a focused, smaller version of a data warehouse designed to serve the specific needs of a single business department or team.
📦 What Is a Data Mart?
A data mart is a subset of a data warehouse that contains department-specific data, organized for fast querying and analysis.
-
Think of a data warehouse as a full library 📚
-
And data marts as department-specific bookshelves 📖 for HR, Sales, Finance, etc.
🧠 Key Characteristics
Feature | Description |
|---|---|
🎯 Subject-Oriented | Designed for a specific business function (e.g., Sales, Marketing) |
🚀 Faster Access | Optimized for performance on a smaller data set |
📊 Simplified Schema | Often uses dimensional models like Star Schema or Snowflake Schema |
🔄 Periodic Loads | Typically updated from the central warehouse on a scheduled basis |
🧱 Types of Data Marts
Type | Description |
|---|---|
Dependent | Built directly from a centralized data warehouse (most common) |
Independent | Created directly from source systems, not tied to a data warehouse |
Hybrid | Combines data from a warehouse and other sources |
🔁 Example Use Case
Scenario:
The Sales team needs to analyze quarterly revenue, lead conversion, and regional performance.
Solution:
A Sales Data Mart could contain:
-
Tables like:
sales_fact,product_dim,customer_dim,time_dim -
Pre-aggregated metrics: revenue by region, monthly sales trends, etc.
-
Used in: Dashboards, Power BI reports, KPI tracking
✅ Benefits
-
💨 Faster performance due to smaller scope
-
🔐 Improved data access control for team-specific data
-
🧩 Modular—easy to scale and develop incrementally
-
🎓 User-friendly for non-technical analysts
Let me know if you'd like an ER diagram, sample schema, or SQL query example based on a data mart!
Great! Below is a sample Data Mart designed for a Sales Department, with:
-
ER Diagram-style structure
-
Sample schema for each table
-
Example SQL queries to use the data mart
🧱 Sales Data Mart – Schema (Star Schema)
+----------------+
| Time_Dim |
+----------------+
↑
|
+-------------+ +----------------+ +--------------+ +------------------+
| Product_Dim | ←── | Sales_Fact | ───→ | Customer_Dim | ───→ | Region_Dim |
+-------------+ +----------------+ +--------------+ +------------------+
↑
|
+---------------+
| Employee_Dim |
+---------------+
🗃️ Table Definitions
1. Sales_Fact (Fact Table)
Stores measurable metrics (facts) like revenue and quantity sold.
CREATE TABLE Sales_Fact (
sale_id INT PRIMARY KEY,
time_id INT,
product_id INT,
customer_id INT,
employee_id INT,
region_id INT,
units_sold INT,
total_revenue DECIMAL(10, 2)
);
2. Product_Dim
Product catalog information.
CREATE TABLE Product_Dim (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
price DECIMAL(8, 2)
);
3. Customer_Dim
Customer demographic and business type info.
CREATE TABLE Customer_Dim (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
segment VARCHAR(50),
email VARCHAR(100)
);
4. Employee_Dim
Salesperson or support staff involved in the transaction.
CREATE TABLE Employee_Dim (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(100),
position VARCHAR(50)
);
5. Time_Dim
Date breakdown for reporting.
CREATE TABLE Time_Dim (
time_id INT PRIMARY KEY,
date DATE,
month VARCHAR(20),
quarter VARCHAR(10),
year INT
);
6. Region_Dim
Geographical data for customer location.
CREATE TABLE Region_Dim (
region_id INT PRIMARY KEY,
region_name VARCHAR(100),
country VARCHAR(100)
);
📊 Sample SQL Queries
1. Total Revenue by Region (Q1 2025)
SELECT r.region_name, SUM(s.total_revenue) AS total_sales
FROM Sales_Fact s
JOIN Region_Dim r ON s.region_id = r.region_id
JOIN Time_Dim t ON s.time_id = t.time_id
WHERE t.year = 2025 AND t.quarter = 'Q1'
GROUP BY r.region_name;
2. Top 5 Products by Sales Revenue
SELECT p.product_name, SUM(s.total_revenue) AS revenue
FROM Sales_Fact s
JOIN Product_Dim p ON s.product_id = p.product_id
GROUP BY p.product_name
ORDER BY revenue DESC
LIMIT 5;
3. Monthly Sales by Salesperson
SELECT e.employee_name, t.month, SUM(s.total_revenue) AS monthly_sales
FROM Sales_Fact s
JOIN Employee_Dim e ON s.employee_id = e.employee_id
JOIN Time_Dim t ON s.time_id = t.time_id
GROUP BY e.employee_name, t.month
ORDER BY e.employee_name, t.month;
Let me know if you'd like this as a downloadable ERD image, or want a sample dataset to try locally or in SQL-based tools like Snowflake, BigQuery, or PostgreSQL!