501 — Build an E-Commerce Analytics Platform

Advanced

Design, build, and query a complete e-commerce analytics database from scratch using PostgreSQL and Docker. This is a separate capstone project with its own richer schema, fresh container, and fresh dataset — not the smaller `devops_sql` schema from Lessons 402-405.

Learning Objectives

1
Design and create a normalized schema for a complete e-commerce system
2
Set up PostgreSQL via Docker and load realistic data
3
Write complex analytical queries using CTEs and window functions
4
Define result grain before aggregating so metrics stay correct
5
Create views for common business reports
6
Add indexes and verify performance with EXPLAIN ANALYZE
7
Implement audit triggers for data change tracking
8
Write monitoring queries for database health
9
Backup and restore the database
Step 1

Launch PostgreSQL via Docker

Spin up a fresh PostgreSQL 16 instance in a Docker container with a named volume for data persistence. This capstone intentionally uses a brand-new container and schema so you can build a production-style analytics project from scratch.

Commands to Run

docker run -d --name ecommerce-db -e POSTGRES_USER=analytics -e POSTGRES_PASSWORD=analytics123 -e POSTGRES_DB=ecommerce -p 5432:5432 -v ecommerce_pgdata:/var/lib/postgresql/data postgres:16-alpine
docker ps --filter name=ecommerce-db --format 'table {{.Names}}\t{{.Status}}\t{{.Ports}}'
docker exec -it ecommerce-db psql -U analytics -d ecommerce -c 'SELECT version();'

What This Does

We launch PostgreSQL 16 on Alpine Linux (small image) inside Docker. The -e flags set the superuser, password, and default database. Port 5432 is mapped to localhost so external tools can connect. The named volume ecommerce_pgdata ensures data survives container restarts. All subsequent commands use 'docker exec' to run psql inside this container. This capstone is intentionally separate from the `pg-sql-course` container used earlier, because the schema here includes additional analytics-friendly columns like `sku`, `cost`, `stock_qty`, `shipping_cost`, and richer review data.

Expected Outcome

Container 'ecommerce-db' running and healthy. The SELECT version() query returns something like 'PostgreSQL 16.x on ... compiled by ...'.

Pro Tips

  • 1
    If port 5432 is already in use, change -p to 5433:5432 and adjust later commands
  • 2
    Named volumes are preferred over bind mounts for database data
  • 3
    Alpine images are ~80 MB vs ~400 MB for full Debian-based images
  • 4
    You can connect GUI tools (pgAdmin, DBeaver) to localhost:5432 with user 'analytics'

Common Mistakes to Avoid

  • ⚠️Forgetting the -d flag (container runs in foreground and blocks your terminal)
  • ⚠️Using a bind mount path that doesn't exist (causes permission errors)
  • ⚠️Not removing a previous container with the same name — run 'docker rm -f ecommerce-db' first if needed
Was this step helpful?

All Steps (0 / 15 completed)