Projects#
A collection of work spanning technical leadership, research, education, and creative storytelling—unified by a focus on making complex systems understandable and impactful.
Featured Work#
Deep Learning at Scale#
O'Reilly Media, 2024 | Type: Technical Book | Status: Published

A comprehensive guide to building and deploying deep learning systems that scale effectively. This book bridges the gap between research and production, covering hardware optimization, distributed training, model serving, and efficient deployment strategies.
Addresses the critical challenge facing ML teams: moving from prototype to production-grade systems that handle real-world scale, reliability, and performance requirements.
Impact: Used by ML teams at major tech companies | Featured in university ML engineering courses | Available globally via O'Reilly platform
Curious Cassie Children's Book Series#
2022-Present | Type: Children's Literature | Status: Published

Educational chapter book series for ages 5-10 that celebrates scientific discovery through the adventures of Cassie, an endlessly curious 6-year-old. Each book honors a pioneering scientist, weaving their discoveries into engaging narratives.
Designed to combat passive digital entertainment by offering active, imaginative engagement with stories of scientists and philosophers who shaped our understanding of the world.
Series Status: 3 books published (Isaac Newton, Emma Johnston, Carol Dweck) | Available on Amazon and FableFlow
FableFlow: AI-Powered Book Production#
2025 | Type: Open Source Platform | Status: Active Development
An open-source agentic AI pipeline that transforms story manuscripts into complete multimedia experiences: illustrations, narration, music, and multi-format output (EPUB, PDF, HTML, video). Built to democratize professional children's book production by cutting costs by a factor of 10 to 100 and shortening timelines from months to days.
Deliberately built on open-source models and fully open source itself. The project embodies the belief that AI-assisted creative tools should be open, auditable, and accessible—not locked behind proprietary walls.
Key Features:
- Multi-stage production pipeline with specialized AI agents
- FableFlow Studio for interactive refinement and iteration
- Human-in-the-loop design preserving creative control
- Support for open-source models (FLUX, etc.)
Impact: Powers the Curious Cassie series production | Available for independent authors worldwide
Research & Analysis#
Reproducible Machine Learning#
Type: Framework & Best Practices | Status: Completed
Comprehensive framework for achieving reproducibility in deep learning research and production systems. Addresses a persistent problem: many ML experiments cannot be reliably reproduced, which undermines both scientific rigor and production reliability.
Key Contributions:
- 4-part O'Reilly interactive learning series
- Practical techniques for deterministic training
- Analysis of reproducibility challenges across hardware, software, and data
Impact: Adopted by research teams and production ML organizations seeking reliable experimentation
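As a small illustration of the deterministic-training idea, the sketch below seeds a random generator and checks that two runs with the same seed produce identical results. It is only a minimal sketch using Python's standard library: `simulate_training_run` is a hypothetical stand-in for a training loop, not code from the framework. A real deep learning setup would also need to seed NumPy and the framework in use and to control nondeterministic GPU kernels, none of which is shown here.

```python
import random


def simulate_training_run(seed: int, steps: int = 5) -> list[float]:
    """Hypothetical stand-in for a training loop: returns one 'loss' per step."""
    # A private generator avoids hidden global state shared with other code.
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    losses = []
    for _ in range(steps):
        # Shuffle the "data order" and perturb the weights, as a training step might.
        order = list(range(10))
        rng.shuffle(order)
        weights = [w - 0.1 * w + 0.01 * rng.gauss(0.0, 1.0) for w in weights]
        losses.append(sum(w * w for w in weights) + 0.001 * order[0])
    return losses


run_a = simulate_training_run(seed=42)
run_b = simulate_training_run(seed=42)
run_c = simulate_training_run(seed=7)

print("same seed reproduces:", run_a == run_b)
print("different seed differs:", run_a != run_c)
```

Passing the generator around explicitly, rather than relying on the module-level `random` functions, keeps every source of randomness visible in one place, which tends to make a missing seed easier to spot.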
Label Noise with CleanLab#
Type: Data Quality Research | Status: Completed
Research on data quality and label noise detection in machine learning datasets using confident learning techniques. Explores how mislabeled data impacts model performance and strategies for systematic identification and correction.
Key Focus:
- Impact of label noise on model accuracy
- Confident learning methodology
- Practical applications in medical imaging
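The core idea of confident learning can be sketched in a few lines. The example below is a simplified, pure-Python illustration and not CleanLab's API: it computes a per-class threshold (the mean predicted probability of a class over the examples labeled with that class) and flags an example when the class it is confidently predicted to belong to differs from its given label. The function names and the toy data are hypothetical, and the sketch assumes every class has at least one labeled example.

```python
def class_thresholds(labels, probs, num_classes):
    """Per-class threshold: mean predicted probability of class j over examples labeled j."""
    thresholds = []
    for j in range(num_classes):
        scores = [p[j] for y, p in zip(labels, probs) if y == j]
        # Assumption: each class has at least one labeled example.
        thresholds.append(sum(scores) / len(scores))
    return thresholds


def find_label_issues(labels, probs, num_classes):
    """Return indices whose confidently predicted class disagrees with the given label."""
    thresholds = class_thresholds(labels, probs, num_classes)
    issues = []
    for idx, (y, p) in enumerate(zip(labels, probs)):
        # Classes for which the model is at least as confident as the class average.
        confident = [j for j in range(num_classes) if p[j] >= thresholds[j]]
        if not confident:
            continue  # The model is not confident about any class: do not flag.
        best = max(confident, key=lambda j: p[j])
        if best != y:
            issues.append(idx)
    return issues


# Toy data (hypothetical): given labels and predicted class probabilities.
labels = [0, 0, 0, 1, 1, 1]
probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],  # Labeled 0, but the model strongly predicts class 1.
    [0.15, 0.85],
    [0.10, 0.90],
    [0.30, 0.70],
]

issues = find_label_issues(labels, probs, num_classes=2)
print("suspected label issues at indices:", issues)
```

In practice the predicted probabilities should come from out-of-sample predictions (for example via cross-validation), since a model that has memorized its noisy training labels will rarely disagree with them.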
Feature Analysis: t-SNE vs UMAP#
Type: Comparative Analysis | Status: Completed
Detailed comparison of dimensionality reduction techniques (t-SNE vs UMAP) for understanding and visualizing neural network representations. Provides practical guidance for practitioners choosing visualization methods for high-dimensional embeddings.
Techniques Covered:
- t-SNE methodology and applications
- UMAP advantages and limitations
- Practical recommendations for different use cases
Tools & Open Source#
KCD (Kubernetes Cluster Debugger)#
Type: DevOps Tool | Status: Open Source
Tools and utilities for debugging Kubernetes clusters and understanding pod lifecycle issues. Born from practical experience running ML workloads on Kubernetes and encountering common failure modes.
Focus Areas:
- Pod OOM (Out of Memory) debugging
- Container lifecycle analysis
- Resource allocation troubleshooting
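To give a flavor of the pod OOM debugging mentioned above, here is a minimal sketch that scans pod status JSON for containers terminated with the reason `OOMKilled`. It is not KCD's code: the `find_oom_killed` helper and the sample data are hypothetical, and the input is assumed to have the shape returned by `kubectl get pods -o json`.

```python
import json

# Hypothetical, trimmed sample shaped like `kubectl get pods -o json` output.
SAMPLE_POD_LIST = """
{
  "items": [
    {
      "metadata": {"name": "trainer-0", "namespace": "ml"},
      "status": {
        "containerStatuses": [
          {
            "name": "train",
            "restartCount": 3,
            "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
          }
        ]
      }
    },
    {
      "metadata": {"name": "api-0", "namespace": "ml"},
      "status": {
        "containerStatuses": [
          {"name": "api", "restartCount": 0, "lastState": {}}
        ]
      }
    }
  ]
}
"""


def find_oom_killed(pod_list: dict) -> list[tuple[str, str, int]]:
    """Return (pod, container, restart_count) for containers terminated by OOMKilled."""
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for cs in pod.get("status", {}).get("containerStatuses", []):
            # Check the current state and the previous one (kept after a restart).
            for state_key in ("state", "lastState"):
                terminated = cs.get(state_key, {}).get("terminated")
                if terminated and terminated.get("reason") == "OOMKilled":
                    hits.append((pod_name, cs["name"], cs.get("restartCount", 0)))
                    break  # Report each container at most once.
    return hits


oom_hits = find_oom_killed(json.loads(SAMPLE_POD_LIST))
for pod_name, container, restarts in oom_hits:
    print(f"{pod_name}/{container}: OOMKilled, restarts={restarts}")
```

Checking `lastState` as well as `state` matters because a container that was OOM-killed and then restarted reports as running, with the kill recorded only in its previous state.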
Educational Resources#
O'Reilly Interactive Learning Series#
Type: Online Course | Status: Published
Four interactive learning scenarios covering reproducible machine learning, published on the O'Reilly platform. Hands-on, practical approach to understanding reproducibility challenges and solutions.
Modules:
1. Semantic Segmentation on Oxford Pets Dataset
2. Identifying the Reproducibility Challenge
3. Random Seeds and Process-Parallelism
4. Achieving 100% Reproducibility
Format: Interactive coding scenarios with immediate feedback | Self-paced learning
Project Themes#
Making Complex Systems Understandable#
Whether it's scaling deep learning infrastructure, explaining Isaac Newton to 6-year-olds, or debugging Kubernetes clusters—the common thread is making complex technical systems accessible and actionable.
From Research to Production#
These projects focus on bridging the gap between research prototypes and production systems, providing practical guidance for teams building real-world ML applications.
Education Across Audiences#
Educational content spans from children's books sparking early scientific curiosity to advanced technical training for ML practitioners—different audiences, same commitment to clarity and impact.
Archive#
IBM Redbooks: Creating Plugins for Lotus Notes, Sametime, and Symphony (2011)
Technical guide for enterprise software development in the Lotus ecosystem.
Projects reflect 18+ years of work across medical imaging, ML engineering, education, and creative writing—unified by curiosity, rigor, and a desire to make technical knowledge accessible.