Data Mart?

A Data Mart is a focused, smaller version of a data warehouse designed to serve the specific needs of a single business department or team.


📦 What Is a Data Mart?

A data mart is a subset of a data warehouse that contains department-specific data, organized for fast querying and analysis.

  • Think of a data warehouse as a full library 📚

  • And data marts as department-specific bookshelves 📖 for HR, Sales, Finance, etc.


🧠 Key Characteristics

Feature

Description

🎯 Subject-Oriented

Designed for a specific business function (e.g., Sales, Marketing)

🚀 Faster Access

Optimized for performance on a smaller data set

📊 Simplified Schema

Often uses dimensional models like Star Schema or Snowflake Schema

🔄 Periodic Loads

Typically updated from the central warehouse on a scheduled basis


🧱 Types of Data Marts

Type

Description

Dependent

Built directly from a centralized data warehouse (most common)

Independent

Created directly from source systems, not tied to a data warehouse

Hybrid

Combines data from a warehouse and other sources


🔁 Example Use Case

Scenario:

The Sales team needs to analyze quarterly revenue, lead conversion, and regional performance.

Solution:

A Sales Data Mart could contain:

  • Tables like: sales_fact, product_dim, customer_dim, time_dim

  • Pre-aggregated metrics: revenue by region, monthly sales trends, etc.

  • Used in: Dashboards, Power BI reports, KPI tracking


✅ Benefits

  • 💨 Faster performance due to smaller scope

  • 🔐 Improved data access control for team-specific data

  • 🧩 Modular—easy to scale and develop incrementally

  • 🎓 User-friendly for non-technical analysts


Let me know if you'd like an ER diagram, sample schema, or SQL query example based on a data mart!

Great! Below is a sample Data Mart designed for a Sales Department, with:

  • ER Diagram-style structure

  • Sample schema for each table

  • Example SQL queries to use the data mart


🧱 Sales Data Mart – Schema (Star Schema)

                     +----------------+
                     |   Time_Dim     |
                     +----------------+
                            ↑
                            |
+-------------+     +----------------+      +--------------+      +------------------+
| Product_Dim | ←── |   Sales_Fact   | ───→ | Customer_Dim | ───→ | Region_Dim       |
+-------------+     +----------------+      +--------------+      +------------------+
                            ↑
                            |
                     +---------------+
                     |  Employee_Dim |
                     +---------------+

🗃️ Table Definitions

1. Sales_Fact (Fact Table)

Stores measurable metrics (facts) like revenue and quantity sold.

CREATE TABLE Sales_Fact (
    sale_id INT PRIMARY KEY,
    time_id INT,
    product_id INT,
    customer_id INT,
    employee_id INT,
    region_id INT,
    units_sold INT,
    total_revenue DECIMAL(10, 2)
);

2. Product_Dim

Product catalog information.

CREATE TABLE Product_Dim (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(100),
    category VARCHAR(50),
    price DECIMAL(8, 2)
);

3. Customer_Dim

Customer demographic and business type info.

CREATE TABLE Customer_Dim (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    segment VARCHAR(50),
    email VARCHAR(100)
);

4. Employee_Dim

Salesperson or support staff involved in the transaction.

CREATE TABLE Employee_Dim (
    employee_id INT PRIMARY KEY,
    employee_name VARCHAR(100),
    position VARCHAR(50)
);

5. Time_Dim

Date breakdown for reporting.

CREATE TABLE Time_Dim (
    time_id INT PRIMARY KEY,
    date DATE,
    month VARCHAR(20),
    quarter VARCHAR(10),
    year INT
);

6. Region_Dim

Geographical data for customer location.

CREATE TABLE Region_Dim (
    region_id INT PRIMARY KEY,
    region_name VARCHAR(100),
    country VARCHAR(100)
);

📊 Sample SQL Queries

1. Total Revenue by Region (Q1 2025)

SELECT r.region_name, SUM(s.total_revenue) AS total_sales
FROM Sales_Fact s
JOIN Region_Dim r ON s.region_id = r.region_id
JOIN Time_Dim t ON s.time_id = t.time_id
WHERE t.year = 2025 AND t.quarter = 'Q1'
GROUP BY r.region_name;

2. Top 5 Products by Sales Revenue

SELECT p.product_name, SUM(s.total_revenue) AS revenue
FROM Sales_Fact s
JOIN Product_Dim p ON s.product_id = p.product_id
GROUP BY p.product_name
ORDER BY revenue DESC
LIMIT 5;

3. Monthly Sales by Salesperson

SELECT e.employee_name, t.month, SUM(s.total_revenue) AS monthly_sales
FROM Sales_Fact s
JOIN Employee_Dim e ON s.employee_id = e.employee_id
JOIN Time_Dim t ON s.time_id = t.time_id
GROUP BY e.employee_name, t.month
ORDER BY e.employee_name, t.month;

Let me know if you'd like this as a downloadable ERD image, or want a sample dataset to try locally or in SQL-based tools like Snowflake, BigQuery, or PostgreSQL!

Updated on