Designing a Modern Analytics Platform with Python, dbt, and Metabase
Building an enterprise-grade ELT pipeline for AI conversation analytics
This is Article 4 of a 7-part series on building a real-time data pipeline for Claude Code conversation logs.
- Article 1: System Overview and Architecture
- Article 2: Building the Real-Time Sync Service
- Article 3: Creating the Conversation Browser UI
- Article 4: Designing the Analytics Platform (You are here)
- Article 5: Python ETL with Prefect Orchestration
- Article 6: Data Modeling with dbt
- Article 7: Building Dashboards with Metabase
Introduction: Why Analyze Conversation Logs?
In Article 2, we built a sync service that captures Claude Code conversations in real-time, storing them in MongoDB. But raw data sitting in a database is like an unread book on a shelf. The real value emerges when we can answer questions: How are developers using the AI assistant? Which tools get invoked most frequently? What patterns indicate productive coding sessions?
This is where analytics transforms curiosity into insight.
Building an analytics platform on conversation logs presents unique challenges. The data is semi-structured (nested JSON with variable schemas), arrives continuously, and needs to support both operational dashboards and ad-hoc exploration. We need a system that handles incremental updates efficiently while maintaining data quality.
In this article, I will walk you through the high-level architecture of our analytics platform. We will explore the medallion pattern for data organization, understand why we chose each technology in our stack, and see how the pieces fit together. Think of this as the architectural blueprint. The subsequent articles will dive deep into each component: Python ETL in Article 5, dbt transformations in Article 6, and Metabase dashboards in Article 7.
Let’s start by understanding how we organize data as it flows through the system.
The Medallion Architecture: Bronze, Silver, and Gold
The medallion architecture is a data design pattern that organizes data into layers of increasing refinement. Picture a refinery: crude oil enters, and through successive processing stages, it becomes gasoline, plastics, and other refined products. Our data goes through a similar journey.
Bronze Layer: The Foundation
The bronze layer is where raw data lands after extraction from MongoDB. We perform minimal transformations here: type casting, null handling, and basic cleaning. The goal is to create a reliable, queryable version of the source data without losing any information.
Our bronze layer contains three staging models:
- stg_conversations: Core conversation metadata (session ID, project ID, timestamps)
- stg_messages: Individual messages with role and content
- stg_tool_calls: Tool invocations extracted from assistant responses
These models are materialized as views because they simply reshape the source data. No heavy computation happens here.
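To make this concrete, here is a rough sketch of what a staging model such as stg_conversations could look like. The source column names and casts are illustrative assumptions, not the actual definitions (those come in Article 6):

```sql
-- Illustrative staging view; real columns and casts are covered in Article 6
with source as (
    select * from {{ source('raw', 'conversations') }}
)

select
    cast(sessionId as varchar)    as session_id,
    cast(projectId as varchar)    as project_id,
    cast(ingestedAt as timestamp) as ingested_at,
    lower(type)                   as entry_type
from source
where sessionId is not null
```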
Silver Layer: Where Business Logic Lives
The silver layer is where we apply business rules and enrich the data. We join related entities, compute derived fields, and handle edge cases that the bronze layer exposes.
Three intermediate models form our silver layer:
- int_messages_enriched: Messages joined with session context, content length calculated, code blocks detected
- int_tool_usage: Tool calls with execution metadata and categorization
- int_sessions_computed: Sessions with calculated duration, message counts, and activity levels
The silver layer answers the question: “What does this data actually mean in our business context?”
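To give a flavor of that enrichment, int_messages_enriched might join messages to their session context and derive a couple of fields along these lines (the column names are illustrative):

```sql
-- Illustrative enrichment; the real model handles more edge cases
select
    m.session_id,
    m.message_role,
    length(m.message_content)        as content_length,
    m.message_content like '%```%'   as has_code_block,
    s.project_id,
    s.started_at                     as session_started_at
from {{ ref('stg_messages') }} m
left join {{ ref('stg_conversations') }} s
    using (session_id)
```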
Gold Layer: Ready for Consumption
The gold layer follows dimensional modeling principles, producing tables optimized for analytics queries. This is where we build the star schema that powers our dashboards.
Dimensions provide context:
- dim_date: Calendar attributes for time-based analysis
- dim_projects: Project-level aggregations and metadata
- dim_sessions: Session statistics and classifications
- dim_tools: Tool catalog with categories
Facts capture events:
- fct_messages: One row per message with foreign keys to dimensions
- fct_tool_calls: Tool usage events with timing and status
- fct_file_operations: Code modification tracking
Aggregates pre-compute common queries:
- agg_daily_summary: Daily metrics for trend analysis
- agg_hourly_activity: Hour-of-day patterns for heatmaps
- agg_session_metrics: Per-session productivity indicators
The gold layer tables are materialized as physical tables (not views) because they serve high-frequency dashboard queries. Pre-computing these aggregations dramatically improves dashboard performance.
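In dbt, this per-layer split is usually declared once in dbt_project.yml rather than in each model. Here is a sketch of how ours might look; the project key and folder names are assumptions that mirror the directory tree shown later:

```yaml
# Sketch of per-layer materialization defaults in dbt_project.yml
models:
  analytics:
    staging:
      +materialized: view
    intermediate:
      +materialized: view
    marts:
      +materialized: table
```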
Technology Stack Overview
Choosing the right tools for an analytics platform requires balancing several concerns: development velocity, operational simplicity, cost, and scalability. Here is why we selected each component of our stack.
```text
+-----------------------------------------------------------+
|                    VISUALIZATION LAYER                     |
|                 Metabase (Custom DuckDB)                   |
+-----------------------------------------------------------+
|                   TRANSFORMATION LAYER                     |
|      dbt-core + dbt-duckdb (Medallion Architecture)        |
+-----------------------------------------------------------+
|                      STORAGE LAYER                         |
|               DuckDB (OLAP) + Parquet Files                |
+-----------------------------------------------------------+
|                   ORCHESTRATION LAYER                      |
|             Prefect 2.x + PostgreSQL Backend               |
+-----------------------------------------------------------+
|                     EXTRACTION LAYER                       |
|             Python 3.11 + PyMongo + PyArrow                |
+-----------------------------------------------------------+
|                       SOURCE LAYER                         |
|               MongoDB (Conversation Logs)                  |
+-----------------------------------------------------------+
```

Python for Extraction
Python is the natural choice for data extraction. PyMongo provides a mature MongoDB client, and PyArrow enables efficient Parquet file generation. We use Pydantic for configuration management, giving us type-safe settings with environment variable support.
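As a minimal sketch of that configuration pattern, assuming Pydantic v2 with the pydantic-settings package (the setting names, defaults, and prefix are illustrative rather than the project's actual config.py):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Type-safe settings loaded from environment variables (illustrative fields)."""
    model_config = SettingsConfigDict(env_prefix="ANALYTICS_")

    mongo_uri: str = "mongodb://localhost:27017"
    mongo_database: str = "claude_logs"
    raw_data_dir: str = "/data/raw"
    duckdb_path: str = "/data/duckdb/analytics.duckdb"
    batch_size: int = 10_000

settings = Settings()  # reads ANALYTICS_MONGO_URI, ANALYTICS_BATCH_SIZE, ...
```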
The extraction layer handles:
- Incremental extraction using high water marks
- Batched processing (10,000 documents per batch)
- Date-partitioned Parquet output with Snappy compression
- Dead letter handling for malformed documents
Prefect for Orchestration
Prefect 2.x orchestrates our pipeline with a clean Python-native API. Unlike Airflow’s DAG-centric model, Prefect treats flows as regular Python functions decorated with @flow and @task. This makes testing and local development straightforward.
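Here is a minimal sketch of that pattern; the function names and flow structure are illustrative stand-ins for the real flows in Article 5:

```python
from prefect import flow, task

@task(retries=3, retry_delay_seconds=30)
def extract_from_mongo(full_backfill: bool = False) -> list[str]:
    """Pull new documents from MongoDB and land them as Parquet (stubbed here)."""
    ...

@task
def load_into_duckdb(parquet_files: list[str]) -> None:
    """Upsert the landed Parquet files into DuckDB (stubbed here)."""
    ...

@flow(name="analytics-pipeline")
def analytics_pipeline(full_backfill: bool = False) -> None:
    files = extract_from_mongo(full_backfill)
    load_into_duckdb(files)

if __name__ == "__main__":
    # Runs locally like any Python script; schedules are added later via deployments
    analytics_pipeline()
```

Calling analytics_pipeline() directly runs the flow locally, which keeps testing and debugging close to ordinary Python development.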
Key Prefect features we leverage:
- Retry with exponential backoff: Extraction retries 3 times with 30-second delays
- Work pools: Distribute execution across workers
- Scheduled deployments: Hourly incremental, daily full refresh
- Web UI: Monitor runs at localhost:4200
DuckDB as the Analytical Database
DuckDB is an embedded OLAP database that runs in-process. Think of it as SQLite for analytics. This choice might seem unconventional compared to cloud data warehouses, but it offers compelling advantages for our use case:
- Zero infrastructure: No server to manage, just a file
- Columnar storage: Optimized for analytical queries
- Excellent Parquet support: Native reading with predicate pushdown
- SQL compatibility: Standard SQL with analytical functions
For a single-user or small-team analytics platform processing millions of rows, DuckDB delivers impressive performance without the complexity of distributed systems.
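To give a feel for how lightweight this is, querying the landed Parquet files from Python takes only a few lines. The database path here is an assumption; the read_parquet pattern reappears in the loading step later in this article:

```python
import duckdb

# Open (or create) the analytical database -- a single file, no server
con = duckdb.connect("/data/duckdb/analytics.duckdb")

# DuckDB scans Parquet natively and pushes the filter down into the file scan
rows = con.execute(
    """
    SELECT count(*) AS messages
    FROM read_parquet('/data/raw/date=*/conversations_*.parquet', hive_partitioning=true)
    WHERE date >= '2026-01-01'
    """
).fetchall()
print(rows)
```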
dbt for Transformations
dbt (data build tool) has become the industry standard for SQL-based transformations. It brings software engineering practices to analytics:
- Version-controlled SQL: Models live in Git alongside application code
- Dependency management: dbt builds models in the correct order
- Testing framework: Schema tests, custom tests, data freshness checks
- Documentation: Auto-generated data lineage and model docs
Our dbt project uses the dbt-duckdb adapter, which integrates seamlessly with our embedded database.
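Wiring dbt to the embedded database takes only a small profiles.yml entry. A sketch, assuming a profile named analytics and the database path used elsewhere in this article:

```yaml
# Illustrative dbt-duckdb profile
analytics:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: /data/duckdb/analytics.duckdb
      threads: 4
```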
Metabase for Visualization
Metabase provides self-service BI capabilities without requiring SQL knowledge. Users can explore data through a point-and-click interface, while power users can write custom queries.
We run a custom Metabase image with the DuckDB driver pre-installed, enabling direct queries against our analytical database. The dashboard configuration lives in the repository, making it reproducible across environments.
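A hedged sketch of what that custom image boils down to; the base tag and the driver jar (downloaded separately here) are placeholders, not the exact artifacts in the repository:

```dockerfile
# Illustrative only: pin the Metabase tag and the driver release you actually use
FROM metabase/metabase:latest

# Metabase loads third-party drivers from the /plugins directory
COPY duckdb.metabase-driver.jar /plugins/duckdb.metabase-driver.jar
```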
Great Expectations for Data Quality
Data quality validation happens at multiple pipeline stages using Great Expectations. We define expectations (rules) for each layer:
- Bronze: Required fields present, valid data types
- Silver: Referential integrity, business rule compliance
- Gold: Aggregate consistency, freshness thresholds
When expectations fail, the pipeline alerts but continues processing. This prevents bad data from silently corrupting downstream models while avoiding complete pipeline failures for minor issues.
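As a rough illustration of a bronze-layer check, written against the long-standing Pandas-dataset API (newer Great Expectations releases use a different, fluent API, and the column names here are assumptions):

```python
import great_expectations as ge
import pandas as pd

# Validate one landed Parquet file (path and columns are illustrative)
df = pd.read_parquet("/data/raw/date=2026-01-01/conversations_001.parquet")
batch = ge.from_pandas(df)

results = [
    batch.expect_column_values_to_not_be_null("sessionId"),
    batch.expect_column_values_to_be_in_set("type", ["user", "assistant", "system"]),
]

# Alert on failures but let the pipeline continue
failed = [r for r in results if not r.success]
if failed:
    print(f"{len(failed)} expectation(s) failed -- alerting, not aborting")
```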
Module Organization
The analytics platform follows a clean separation of concerns. Each subdirectory has a single responsibility, making the codebase navigable and maintainable.
```text
analytics/
├── analytics/                    # Python ETL package
│   ├── __init__.py
│   ├── cli.py                    # Typer CLI interface
│   ├── config.py                 # Pydantic settings
│   ├── extractor.py              # MongoDB extraction
│   ├── loader.py                 # DuckDB loading
│   ├── quality.py                # Great Expectations
│   └── flows/                    # Prefect orchestration
│       ├── main_pipeline.py
│       └── deployment.py
├── dbt/                          # Transformation models
│   ├── dbt_project.yml
│   ├── profiles.yml
│   └── models/
│       ├── staging/              # Bronze layer (3 models)
│       ├── intermediate/         # Silver layer (3 models)
│       └── marts/                # Gold layer (12 models)
├── great_expectations/           # Data quality config
├── metabase/                     # Custom Metabase image
├── Dockerfile                    # Multi-stage build
├── docker-compose.analytics.yml
├── Makefile                      # Operational commands
└── prefect.yaml                  # Deployment definitions
```

The Python Package (Article 5 Preview)
The analytics/ Python package contains our extraction and loading logic. Key classes include:
- MongoExtractor: Connects to MongoDB, queries with timestamp filters, writes Parquet
- DuckDBLoader: Creates schemas, loads Parquet files, manages indexes
- DataQualityValidator: Runs Great Expectations checkpoints
The CLI (cli.py) exposes these capabilities through a Typer interface, enabling both interactive use and scripted automation.
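A minimal sketch of that interface; the command names and options are illustrative, and the real cli.py wires them to the classes above:

```python
import typer

app = typer.Typer(help="Analytics pipeline CLI (illustrative commands)")

@app.command()
def extract(full_backfill: bool = typer.Option(False, help="Ignore the high water mark")):
    """Extract conversations from MongoDB into Parquet."""
    typer.echo(f"Extracting (full_backfill={full_backfill})...")
    # MongoExtractor(settings).extract(full_backfill=full_backfill)

@app.command()
def load():
    """Load landed Parquet files into DuckDB."""
    typer.echo("Loading into DuckDB...")
    # DuckDBLoader(settings).load()

if __name__ == "__main__":
    app()
```

Something like `python -m analytics.cli extract --full-backfill` then behaves the same in an interactive shell or a scheduled job.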
The dbt Project (Article 6 Preview)
The dbt/ directory contains 18 SQL models organized by layer:
| Layer | Models | Materialization |
|---|---|---|
| Staging (Bronze) | 3 | View |
| Intermediate (Silver) | 3 | View |
| Marts (Gold) | 12 | Table |
Each model includes documentation and tests defined in accompanying _schema.yml files.
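A sketch of one such file; the exact descriptions and tests are part of Article 6:

```yaml
# Illustrative _schema.yml for a staging model
version: 2

models:
  - name: stg_conversations
    description: Core conversation metadata from raw.conversations
    columns:
      - name: session_id
        description: Unique identifier for a Claude Code session
        tests:
          - not_null
          - unique
      - name: project_id
        tests:
          - not_null
```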
Metabase Configuration (Article 7 Preview)
The metabase/ directory contains a custom Dockerfile that builds Metabase with the DuckDB driver. Dashboard configurations can be exported and version-controlled, enabling reproducible analytics environments.
Data Flow: From MongoDB to Dashboard
Understanding the complete data journey helps diagnose issues and plan optimizations. Here is how a conversation entry flows through our system:
Step 1: Extraction
The MongoExtractor queries MongoDB for documents newer than the last high water mark. Documents arrive in batches of 10,000 and are transformed into a flat structure suitable for columnar storage.
```python
# Simplified extraction flow
def extract(self, full_backfill: bool = False) -> List[Path]:
    query = {} if full_backfill else {"ingestedAt": {"$gt": self.hwm.get()}}
    written_files: List[Path] = []
    max_timestamp = self.hwm.get()

    # Cursor is consumed in batches of 10,000 documents
    for batch in self.client.find(query).batch_size(10000):
        records = [self.transform(doc) for doc in batch]
        written_files.append(self.write_parquet(records))
        max_timestamp = max(max_timestamp, max(doc["ingestedAt"] for doc in batch))

    self.hwm.set(max_timestamp)
    return written_files
```

Step 2: Parquet Landing Zone
Extracted data lands in date-partitioned Parquet files:
```text
/data/raw/
├── date=2026-01-01/
│   └── conversations_001.parquet
├── date=2026-01-02/
│   └── conversations_001.parquet
└── ...
```

Parquet provides excellent compression (typically 10x versus JSON) and enables predicate pushdown during queries.
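Here is a hedged sketch of how a batch might be written with PyArrow; the helper name, paths, and the single-file-per-day layout shown above are illustrative (Snappy is also PyArrow's default Parquet codec):

```python
from datetime import date
from pathlib import Path

import pyarrow as pa
import pyarrow.parquet as pq

def write_partitioned_batch(records: list[dict], partition_date: date,
                            base_dir: str = "/data/raw") -> Path:
    """Write one batch of flattened documents into a date-partitioned Parquet file."""
    table = pa.Table.from_pylist(records)

    out_dir = Path(base_dir) / f"date={partition_date.isoformat()}"
    out_dir.mkdir(parents=True, exist_ok=True)

    out_path = out_dir / "conversations_001.parquet"
    pq.write_table(table, out_path, compression="snappy")
    return out_path
```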
Step 3-4: Loading into DuckDB
The DuckDBLoader reads Parquet files using glob patterns and upserts into the raw.conversations table:
```sql
INSERT OR REPLACE INTO raw.conversations
SELECT *
FROM read_parquet('/data/raw/date=*/conversations_*.parquet', hive_partitioning=true);
```

Five indexes support common query patterns: project_id, session_id, date, type, and timestamp.
Step 5-6: dbt Transformation
dbt builds models in dependency order. A single dbt run command:
- Reads from raw.conversations
- Builds staging views (bronze)
- Builds intermediate views (silver)
- Materializes mart tables (gold)
The entire transformation completes in seconds for typical daily volumes.
Step 7: Metabase Queries
Metabase connects directly to DuckDB and queries the gold layer tables. Pre-aggregated tables like agg_daily_summary power dashboard widgets without expensive real-time computation.
Infrastructure with Docker Compose
The analytics platform runs as six Docker services, orchestrated via docker-compose.analytics.yml. This setup provides isolation, reproducibility, and simple deployment.
Service Overview
| Service | Port | Purpose |
|---|---|---|
| prefect-server | 4200 | Workflow orchestration UI and API |
| prefect-db | - | PostgreSQL for Prefect metadata |
| prefect-worker | - | Executes scheduled flows |
| analytics-worker | - | CLI access and ad-hoc commands |
| metabase | 3001 | BI dashboards |
| metabase-db | - | PostgreSQL for Metabase metadata |
Key Configuration Patterns
Shared Volumes: Both workers and Metabase mount duckdb-data, enabling shared access to the analytical database. DuckDB handles concurrent reads gracefully.
Host Network Access: The extra_hosts directive maps host.docker.internal to the host gateway, allowing containers to reach MongoDB running on the host machine.
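A hedged fragment showing how those two patterns look in the compose file; the mount path and volume name are assumptions:

```yaml
# Illustrative fragment of docker-compose.analytics.yml
services:
  analytics-worker:
    volumes:
      - duckdb-data:/data/duckdb              # shared analytical database
    extra_hosts:
      - "host.docker.internal:host-gateway"   # reach MongoDB on the host

  metabase:
    volumes:
      - duckdb-data:/data/duckdb

volumes:
  duckdb-data:
```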
Health Checks: Prefect server and Metabase include HTTP health checks, enabling dependent services to wait for readiness.
```yaml
# Example health check configuration
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:4200/api/health || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 5
  start_period: 30s
```

Running the Stack
Getting the analytics platform running requires just a few commands. The Makefile abstracts Docker Compose complexity into memorable targets.
Quick Start
```bash
cd analytics

# 1. Start all services
make up

# 2. Deploy flows to Prefect (registers schedules)
make deploy

# 3. Run initial backfill
make run-backfill
```

After these commands complete, you have:
- Prefect UI at http://localhost:4200
- Metabase at http://localhost:3001
- dbt docs at http://localhost:8080
Essential Makefile Commands
```bash
# Infrastructure
make up            # Start all services
make up-prefect    # Start only Prefect (no Metabase)
make down          # Stop all services
make logs          # Follow worker logs
make shell         # Interactive shell in worker

# Deployments
make deploy        # Deploy flows to Prefect
make status        # List deployments

# Pipeline Execution
make run-adhoc     # Incremental run
make run-backfill  # Full historical backfill
make run-daily     # Daily full refresh
make pipeline      # Run directly (bypass Prefect)
```

Scheduled Deployments
Once deployed, four pipeline variants run on schedule:
| Deployment | Schedule | Use Case |
|---|---|---|
| hourly-analytics | Every hour | Incremental sync |
| daily-full-refresh | 2:00 AM | Rebuild all tables |
| adhoc-analytics | Manual | On-demand runs |
| full-backfill | Manual | Initial load or recovery |
The hourly schedule ensures dashboards reflect recent activity, while the daily full refresh rebuilds aggregates and catches any missed incremental updates.
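For orientation, one of these deployments might be declared in prefect.yaml roughly as follows; the entrypoint, flow name, and work pool are assumptions, and the exact schema depends on your Prefect 2.x version:

```yaml
# Illustrative deployment entry in prefect.yaml
deployments:
  - name: hourly-analytics
    entrypoint: analytics/flows/main_pipeline.py:analytics_pipeline
    schedule:
      cron: "0 * * * *"
    work_pool:
      name: analytics-pool
```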
Development vs Production
For local development, the default configuration works well. Production deployments should consider:
- External MongoDB: Update MONGO_URI to point to your production database
- Persistent storage: Map volumes to durable storage, not ephemeral Docker volumes
- Resource limits: Add memory and CPU constraints to prevent runaway queries
- Monitoring: Export Prefect and Metabase metrics to your observability stack
Data Model Preview
The gold layer implements a star schema optimized for analytics queries. Here is a simplified view of the core entities:
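As a rough sketch of that shape (column names are illustrative, and the real tables are built by dbt rather than hand-written DDL), the central fact table carries foreign keys out to its dimensions:

```sql
-- Illustrative shape of the core star schema
CREATE TABLE fct_messages (
    message_id      VARCHAR PRIMARY KEY,
    date_key        INTEGER,   -- -> dim_date
    project_key     INTEGER,   -- -> dim_projects
    session_key     INTEGER,   -- -> dim_sessions
    message_role    VARCHAR,
    content_length  INTEGER,
    has_code_block  BOOLEAN
);
```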
This schema supports questions like:
- “How many messages were exchanged last week?” (join FCT_MESSAGES with DIM_DATE; sketched in SQL below)
- “Which projects have the longest average session duration?” (aggregate DIM_SESSIONS by project)
- “What is the tool usage breakdown by day of week?” (join FCT_TOOL_CALLS with DIM_DATE)
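Concretely, the first of those questions might translate into SQL along these lines; the column names follow the sketch above and are assumptions:

```sql
-- Messages exchanged over the last seven days (illustrative column names)
SELECT
    d.calendar_date,
    count(*) AS messages
FROM fct_messages f
JOIN dim_date d ON f.date_key = d.date_key
WHERE d.calendar_date >= current_date - INTERVAL 7 DAY
GROUP BY d.calendar_date
ORDER BY d.calendar_date;
```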
The aggregate tables pre-compute common queries. For example, agg_daily_summary provides:
| Metric | Description |
|---|---|
| total_sessions | Sessions started that day |
| total_messages | Messages exchanged |
| total_tool_calls | Tools invoked |
| avg_session_duration | Mean session length in seconds |
| unique_projects | Distinct projects active |
Dashboard widgets query these aggregates directly, returning results in milliseconds rather than scanning fact tables.
Series Navigation
This article provided the architectural overview of our analytics platform. We covered the medallion pattern, technology choices, and infrastructure setup. Now you understand how the pieces fit together.
The next three articles dive deep into each major component:
Article 5: Python ETL with Prefect Orchestration
- MongoExtractor implementation details
- High water mark state management
- Prefect flow and task definitions
- Error handling and retry strategies
Article 6: Data Modeling with dbt
- Staging model patterns
- Intermediate enrichment logic
- Dimensional modeling decisions
- Testing and documentation
Article 7: Building Dashboards with Metabase
- DuckDB driver setup
- Dashboard design principles
- Key visualizations for conversation analytics
- Sharing and embedding options
Each article includes complete code examples and practical guidance. Whether you are building a similar platform or adapting these patterns for different data sources, the series provides a comprehensive reference.
Wrapping Up
Building an analytics platform is a journey from raw data to actionable insight. The medallion architecture gives us a clear mental model: bronze for raw data, silver for business logic, gold for consumption. Our technology stack (Python, Prefect, DuckDB, dbt, Metabase) balances power with operational simplicity.
The entire platform runs on a single machine using Docker Compose. No cloud data warehouse fees, no complex distributed systems. For many use cases, this simplicity is not a limitation but a feature. You can always scale up later, but starting simple lets you iterate quickly and understand your data deeply.
If you are building analytics on top of application data, I encourage you to try this stack. Clone the repository, run make up, and explore the conversation logs. The patterns demonstrated here apply far beyond AI assistant analytics. Any application generating semi-structured event data can benefit from similar treatment.
In Article 5, we will open the hood on the Python ETL code. See you there.
Tags: Data Engineering, Python, dbt, Analytics, Docker
Have questions or feedback? Share your thoughts in the comments below. If you found this article helpful, follow for the next installment in the series.