Sarthak Makhija
Sarthak Makhija Databases & Storage Systems

Trainings

I design and facilitate hands-on, deep-dive trainings focused on mastering software craftsmanship and storage internals.

Interested in running a training for your team? Email sarthak.makhija@gmail.com or reach out on LinkedIn.

Gamifying Refactoring

Format: Instructor-led training
Duration: 5 hours
Availability: On request

Turn the art of refactoring into a measurable science. This training frames code cleanup as a challenge: can you justify calling something a code smell without using vague terms like "readability" or "maintainability"?

The Game Rules

  1. Identify smells using concrete evidence, not gut feelings.
  2. Justify your refactoring without using "ilities" (readability, flexibility).
  3. Go beyond abstract reasoning to find measurable problems.

"Don't state 'Long method is a smell because it is not readable'. Consider this an opportunity to find reasoning that is measurable."

After this training, you will be able to

  1. Recognize code smells with precision, using evidence rather than intuition
  2. Justify refactoring decisions without leaning on vague terms like "readability" or "maintainability"
  3. Make small, safe, incremental changes that improve code without breaking behavior

Internals of key-value storage engines: LSM-trees and beyond

Format: Instructor-led training
Duration: 2 days
Availability: On request

Participants build an LSM-based storage engine to understand the core components of a key-value store.

Day 1: Foundations & Core Structures

Building the theoretical ground and starting the engine.

Section 1

Introduction: Theory

  • Overview of a storage engine
  • Introduction to block storage devices (HDD)
  • File organization on disk
  • Standard IO & Kernel Page Cache
  • Storage hierarchy & Block data structure goals

Section 2

B+Tree: Theory and Maths

  • Binary Search Tree (BST) & Height Calculation
  • Can BST be used for disk persistence?
  • B+Tree Architecture & Use-cases
  • Disk access patterns: Random vs Sequential
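
As a rough illustration of the height calculation covered in this section, the sketch below counts how many levels a tree needs to reach a given number of keys. The function name `levels_needed` and the simplified model (every node has exactly `fanout` children, one disk read per level) are assumptions made for this example, not material taken from the training.

```python
def levels_needed(n_keys: int, fanout: int) -> int:
    """Smallest number of levels such that fanout ** levels >= n_keys.

    Simplified model (an assumption): every node has exactly `fanout`
    children, so each extra level multiplies the reachable keys by `fanout`.
    """
    levels, capacity = 1, fanout
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels


# A binary tree (fanout 2) versus a wide, B+Tree-like node (fanout 100).
binary_levels = levels_needed(1_000_000, fanout=2)
wide_levels = levels_needed(1_000_000, fanout=100)
print(binary_levels, wide_levels)
```

Under this model, one million keys need about 20 levels with a fanout of 2 but only 3 with a fanout of 100, which is why wide nodes suit storage where each level can cost a disk access.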

Section 3

LSM Introduction: Hands-on

  • Intro to LSM-tree components
  • Implementing Memtable + Iterator
  • Implementing Write Ahead Log (WAL)
  • Recovering from WAL
  • Understanding WAL implementation patterns
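
To give a flavour of the hands-on work, here is a minimal sketch of a memtable backed by a write-ahead log, including recovery by replaying the log. The class names, the record layout (two big-endian 4-byte lengths followed by key and value), and the use of a plain dictionary for the memtable are all assumptions for illustration. The engine built in the training may be structured differently.

```python
import os
import struct
import tempfile


class WriteAheadLog:
    """Append-only log; each record is key_len, value_len, key, value."""

    def __init__(self, path: str):
        self.file = open(path, "ab")

    def append(self, key: bytes, value: bytes) -> None:
        header = struct.pack(">II", len(key), len(value))
        self.file.write(header + key + value)
        self.file.flush()
        os.fsync(self.file.fileno())  # force the record to the device

    def close(self) -> None:
        self.file.close()


def replay(path: str):
    """Yield (key, value) pairs in write order, stopping at a torn record."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            key_len, value_len = struct.unpack(">II", header)
            payload = f.read(key_len + value_len)
            if len(payload) < key_len + value_len:
                break  # incomplete trailing record, e.g. after a crash
            yield payload[:key_len], payload[key_len:]


class Memtable:
    """In-memory table; a dict stands in for a sorted structure here."""

    def __init__(self):
        self.entries = {}

    def put(self, key: bytes, value: bytes) -> None:
        self.entries[key] = value

    def get(self, key: bytes):
        return self.entries.get(key)

    def scan(self):
        # Iterate in key order, as an SSTable flush would need.
        return iter(sorted(self.entries.items()))


def recover(path: str) -> Memtable:
    """Rebuild a memtable by replaying every record in the log."""
    memtable = Memtable()
    for key, value in replay(path):
        memtable.put(key, value)
    return memtable


wal_path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal = WriteAheadLog(wal_path)
for k, v in [(b"a", b"1"), (b"b", b"2"), (b"a", b"3")]:
    wal.append(k, v)
wal.close()
recovered = recover(wal_path)
```

Replaying in write order means a later write to the same key overwrites the earlier one, so the recovered memtable matches the state before the restart.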

Day 2: Advanced Data Structures & Transactions

Moving to disk, optimization, and ACID properties.

Section 4

SSTables: Hands-on

  • Understanding SSTable structure
  • Revising Binary Search
  • Encoding & Endianness
  • Implementing SSTable + Iterator

Section 5

Bloom Filters: Hands-on

  • Naive Bloom Filter implementation
  • Theory behind Bloom Filters
  • Robust Bloom Filter implementation
  • Integrating into SSTable
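
For orientation, the sketch below shows one common way to build a Bloom filter: size the bit array from the expected number of keys and a target false-positive rate, then derive several bit positions per key from two base hashes (double hashing). The class name, the use of SHA-256, and the sizing choices are assumptions for this example and are not necessarily what the training implements.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, expected_keys: int, false_positive_rate: float):
        # Standard sizing formulas: m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2
        self.num_bits = max(1, math.ceil(
            -expected_keys * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / expected_keys * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Double hashing: position_i = (h1 + i * h2) mod m
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep h2 non-zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bloom = BloomFilter(expected_keys=1000, false_positive_rate=0.01)
for i in range(1000):
    bloom.add(f"key-{i}".encode())
```

When attached to an SSTable, a filter like this lets a read skip the file whenever `might_contain` returns `False`, since a Bloom filter never produces false negatives.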

Section 6

Transactions (ACID): Hands-on

  • Understanding ACID properties
  • Atomicity & Durability implementation
  • Isolation Levels explained
  • Serializable Snapshot Isolation (Theory)

Section 7

Concurrency: Hands-on

  • Singular Update Queue
  • Implementing Serializable Snapshot Isolation
  • Transactions + Iterators
  • Introduction to Concurrency

Section 8

Compaction

  • Why do we need compaction?
  • Simple-Leveled Compaction (Theory)
  • Implementing & Integrating Compaction
  • Concurrency + Compaction Revision

After this training, you will be able to

  1. Understand the fundamentals of storage engines and how data actually moves from memory to disk
  2. Understand how LSM-trees achieve write optimization, from memtable flushes to compaction strategies
  3. Understand why write-optimized databases make the trade-offs they do, and when those trade-offs are worth it