Trainings
I design and facilitate hands-on, deep-dive trainings focused on mastering software craftsmanship and storage internals.
Interested in running a training for your team? Email sarthak.makhija@gmail.com or reach out on LinkedIn.
Gamifying Refactoring
Turn the art of refactoring into a measurable science. This training treats code cleanup as a challenge: can you explain why something is a code smell without falling back on vague terms like "readability" or "maintainability"?
The Game Rules
- 1. Identify smells using concrete evidence, not gut feelings.
- 2. Justify your refactoring without using "ilities" (readability, flexibility).
- 3. Go beyond abstract reasoning to find measurable problems.
"Don't state 'Long method is a smell because it is not readable'. Consider this an opportunity to find reasoning that is measurable."
After this training, you will be able to:
- 1. Recognize code smells with precision, using evidence rather than intuition
- 2. Justify refactoring decisions without leaning on vague terms like "readability" or "maintainability"
- 3. Make small, safe, incremental changes that improve code without breaking behavior
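As a rough illustration of what "measurable" reasoning could look like (this is not taken from the training material), the sketch below counts decision points per function using Python's standard `ast` module. The function name `decision_points`, the choice of which nodes count as branches, and the sample code are all assumptions made for this example.

```python
import ast

# Node types treated as "decision points" here; this selection is an assumption.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def decision_points(source):
    """Map each function name to 1 + the number of branching constructs inside it."""
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            result[node.name] = 1 + branches
    return result


# Hypothetical code under review; it is only parsed, never executed.
SAMPLE = '''
def ship(order):
    if order.paid and order.in_stock:
        for item in order.items:
            pack(item)
    return order
'''

print(decision_points(SAMPLE))
```

A statement such as "this method has N independent paths to test" is a claim that can be checked and tracked over time, which "it is not readable" cannot.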
Internals of key-value storage engines: LSM-trees and beyond
Participants build an LSM-based storage engine to understand the core components of a key-value store.
Day 1: Foundations & Core Structures
Building the theoretical ground and starting the engine.
Introduction: Theory
- Overview of a storage engine
- Introduction to block storage devices (HDD)
- File organization on disk
- Standard I/O & Kernel Page Cache
- Storage hierarchy & Block data structure goals
B+Tree: Theory and Maths
- Binary Search Tree (BST) & Height Calculation
- Can BST be used for disk persistence?
- B+Tree Architecture & Use-cases
- Disk access patterns: Random vs Sequential
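The height comparison behind the BST versus B+Tree discussion can be sketched with a few lines of arithmetic. This is a simplified model, assuming a perfectly balanced tree in which each level multiplies the number of reachable keys by the fanout; the function name `min_height` and the fanout of 100 are illustrative choices, not figures from the training.

```python
def min_height(num_keys, fanout):
    """Smallest number of levels h such that fanout**h >= num_keys."""
    height = 0
    capacity = 1
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height


one_billion = 10**9
print(min_height(one_billion, 2))    # balanced binary tree
print(min_height(one_billion, 100))  # tree whose nodes hold about 100 children
```

If every level costs one node visit, and a node visit may cost one disk read, a wide node packed into a disk block cuts the number of reads per lookup from about thirty to about five in this model.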
LSM Introduction: Hands-on
- Intro to LSM-tree components
- Implementing Memtable + Iterator
- Implementing Write Ahead Log (WAL)
- Recovering from WAL
- Understanding WAL implementation patterns
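A minimal sketch of how a memtable and a write-ahead log fit together is shown below. It is only an illustration under stated assumptions: the class name `TinyStore`, the record layout (two big-endian 32-bit lengths followed by key and value), and the use of a plain `dict` as the memtable are all hypothetical. A real memtable is usually a sorted structure so that it can be iterated in key order, and a real WAL normally adds checksums.

```python
import os
import struct


class TinyStore:
    """Hypothetical memtable backed by a write-ahead log (WAL)."""

    HEADER = struct.Struct(">II")  # key length, value length (big-endian)

    def __init__(self, wal_path):
        self.memtable = {}
        self._replay(wal_path)           # recover state from an existing log
        self._wal = open(wal_path, "ab")

    def put(self, key, value):
        # Log first, then apply: a crash after fsync can be replayed.
        self._wal.write(self.HEADER.pack(len(key), len(value)) + key + value)
        self._wal.flush()
        os.fsync(self._wal.fileno())
        self.memtable[key] = value

    def get(self, key):
        return self.memtable.get(key)

    def close(self):
        self._wal.close()

    def _replay(self, wal_path):
        if not os.path.exists(wal_path):
            return
        with open(wal_path, "rb") as wal:
            while True:
                header = wal.read(self.HEADER.size)
                if len(header) < self.HEADER.size:
                    break  # end of log, or a header cut short by a crash
                key_len, value_len = self.HEADER.unpack(header)
                payload = wal.read(key_len + value_len)
                if len(payload) < key_len + value_len:
                    break  # record cut short by a crash; ignore it
                self.memtable[payload[:key_len]] = payload[key_len:]
```

Reopening a `TinyStore` on the same path rebuilds the memtable by replaying the log, which is the recovery step in miniature.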
Day 2: Advanced Data Structures & Transactions
Moving to disk, optimization, and ACID properties.
SSTables: Hands-on
- Understanding SSTable structure
- Revising Binary Search
- Encoding & Endianness
- Implementing SSTable + Iterator
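The SSTable topics above (sorted layout, encoding, endianness, binary search) can be tied together in a small sketch. The layout used here is an assumption made for illustration: each entry is a big-endian 2-byte key length and 4-byte value length followed by the raw bytes, and the offset index is kept in memory rather than written to a footer. The function names are hypothetical.

```python
import struct

ENTRY_HEADER = struct.Struct(">HI")  # key length (2 bytes), value length (4 bytes)


def build_sstable(entries):
    """Encode a dict of bytes -> bytes in sorted key order; return (data, offsets)."""
    data = bytearray()
    offsets = []
    for key in sorted(entries):
        value = entries[key]
        offsets.append(len(data))  # where this entry starts
        data += ENTRY_HEADER.pack(len(key), len(value)) + key + value
    return bytes(data), offsets


def read_entry(data, offset):
    """Decode the entry that starts at the given byte offset."""
    key_len, value_len = ENTRY_HEADER.unpack_from(data, offset)
    start = offset + ENTRY_HEADER.size
    key = data[start:start + key_len]
    value = data[start + key_len:start + key_len + value_len]
    return key, value


def lookup(data, offsets, target):
    """Binary search over the offset index; entries are sorted by key."""
    low, high = 0, len(offsets) - 1
    while low <= high:
        mid = (low + high) // 2
        key, value = read_entry(data, offsets[mid])
        if key == target:
            return value
        if key < target:
            low = mid + 1
        else:
            high = mid - 1
    return None
```

Fixing the byte order in the format string means a file written on one machine decodes identically on another, which is the practical reason endianness shows up in this part of the outline.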
Bloom Filters: Hands-on
- Naive Bloom Filter implementation
- Theory behind Bloom Filters
- Robust Bloom Filter implementation
- Integrating into SSTable
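For readers who have not met Bloom filters before, here is a deliberately naive sketch. The sizing parameters, the use of SHA-256, and the double-hashing trick for deriving several bit positions from one digest are assumptions chosen for brevity; they are not a description of the implementation built in the training.

```python
import hashlib


class BloomFilter:
    """Naive Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step non-zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))
```

In an SSTable, a filter like this is consulted before touching the data blocks: a negative answer means the key is definitely absent, so the read can skip that file entirely.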
Transactions (ACID): Hands-on
- Understanding ACID properties
- Atomicity & Durability implementation
- Isolation Levels explained
- Serializable Snapshot Isolation (Theory)
Concurrency: Hands-on
- Singular Update Queue
- Implementing Serializable Snapshot Isolation
- Transactions + Iterators
- Introduction to Concurrency
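One common way to add serializability checks on top of snapshot reads is an optimistic conflict check at commit time: a transaction aborts if any key it read was overwritten by a transaction that committed after it began. The sketch below shows only that check. The class name `Oracle`, its methods, and the timestamp scheme are hypothetical, and the code assumes all calls arrive one at a time (for example, funnelled through a single update queue), so it contains no locking.

```python
class Oracle:
    """Hypothetical timestamp oracle with commit-time conflict detection."""

    def __init__(self):
        self.next_commit_ts = 1
        self.last_commit = {}  # key -> commit timestamp of its latest writer

    def begin(self):
        # A transaction reads the snapshot as of the latest committed timestamp.
        return self.next_commit_ts - 1

    def try_commit(self, begin_ts, read_keys, write_keys):
        """Return a commit timestamp, or None if the transaction must abort."""
        for key in read_keys:
            if self.last_commit.get(key, 0) > begin_ts:
                return None  # something we read changed after our snapshot
        commit_ts = self.next_commit_ts
        self.next_commit_ts += 1
        for key in write_keys:
            self.last_commit[key] = commit_ts
        return commit_ts


oracle = Oracle()
t1 = oracle.begin()
t2 = oracle.begin()
first = oracle.try_commit(t1, read_keys={"a"}, write_keys={"b"})
second = oracle.try_commit(t2, read_keys={"b"}, write_keys={"a"})
print(first, second)
```

The two transactions above read each other's write targets from the same snapshot, which is the classic write-skew pattern; plain snapshot isolation would let both commit, while this check rejects the second.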
Compaction
- Why do we need compaction?
- Simple-Leveled Compaction (Theory)
- Implementing & Integrating Compaction
- Concurrency + Compaction Revision
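At its core, compaction is a merge of sorted runs in which the newer value for a key wins and deleted keys can eventually be dropped. The sketch below merges two in-memory runs. It is an illustration under assumptions: each run is a list of `(key, value)` tuples sorted by key with unique keys, a value of `None` stands for a tombstone, and the function name `compact` is hypothetical.

```python
def compact(newer, older, drop_tombstones=True):
    """Merge two sorted runs; entries in `newer` shadow entries in `older`."""
    merged = []
    i = j = 0
    while i < len(newer) or j < len(older):
        take_newer = j == len(older) or (
            i < len(newer) and newer[i][0] <= older[j][0]
        )
        if take_newer:
            key, value = newer[i]
            if j < len(older) and older[j][0] == key:
                j += 1  # the older version of this key is obsolete
            i += 1
        else:
            key, value = older[j]
            j += 1
        if value is None and drop_tombstones:
            continue  # deleted key: nothing left to keep
        merged.append((key, value))
    return merged
```

Dropping tombstones is only safe when no older run could still contain the deleted key, which typically means the merge writes into the lowest level; otherwise `drop_tombstones=False` keeps the deletion marker in the output.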
After this training, you will be able to:
- 1. Understand the fundamentals of storage engines and how data actually moves from memory to disk
- 2. Understand how LSM-trees achieve write optimization, from memtable flushes to compaction strategies
- 3. Understand why write-optimized databases make the trade-offs they do, and when those trade-offs are worth it