About Me

Hi, I’m Sarthak Makhija, Principal Architect at Caizin. I write long-form essays on refactoring, storage engines, databases, and engineering trade-offs.
Prior to joining Caizin, I was with Thoughtworks, where I led a team that developed a strongly consistent, distributed key/value storage engine in Go. This system was built with a focus on high availability and strict correctness, featuring:
- Core Storage & Coordination: Badger as the underlying local key/value engine, with etcd managing cluster metadata.
- Distribution & Sharding: Hash partitioning for data distribution across the cluster, using consistent hashing for the assignment of partitions/shards.
- Consistency & Consensus: Raft/Multi-Raft for consensus, paired with two-phase commit to provide serializable isolation.
- Networking: Persistent TCP connections for efficient, low-latency node-to-node communication.
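To make the sharding bullet concrete, here is a minimal sketch of hash partitioning combined with a consistent-hash ring. It is an illustration only, not code from the system described above (which was written in Go). The names `Ring`, `partition_of`, `owner_of`, the node names, and the use of virtual nodes are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hashes any hashable value to a position on the ring.
fn ring_position<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash partitioning: the same key always maps to the same partition.
fn partition_of(key: &str, partition_count: u64) -> u64 {
    ring_position(&key) % partition_count
}

/// A consistent-hash ring that assigns partitions to nodes (hypothetical type).
struct Ring {
    // ring position -> node name
    points: BTreeMap<u64, String>,
}

impl Ring {
    /// Places `virtual_nodes` points on the ring for every node.
    fn new(nodes: &[&str], virtual_nodes: u32) -> Self {
        let mut points = BTreeMap::new();
        for node in nodes {
            for replica in 0..virtual_nodes {
                points.insert(ring_position(&(node, replica)), node.to_string());
            }
        }
        Ring { points }
    }

    /// Walks clockwise from the partition's position to the first node point,
    /// wrapping around to the start of the ring if needed.
    fn owner_of(&self, partition: u64) -> Option<&str> {
        let position = ring_position(&("partition", partition));
        self.points
            .range(position..)
            .next()
            .or_else(|| self.points.iter().next())
            .map(|(_, node)| node.as_str())
    }
}

fn main() {
    let ring = Ring::new(&["node-a", "node-b", "node-c"], 16);
    let partition = partition_of("user:42", 64);
    println!("partition {} -> {:?}", partition, ring.owner_of(partition));
}
```

The property that matters in this sketch is that removing a node only reassigns the partitions that node owned; every other partition keeps its owner.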
I also enjoy sharing my knowledge and contributing to the broader engineering community:
- Authoring: I contributed to the validation of distributed system patterns in the book Patterns of Distributed Systems by Unmesh Joshi. I authored articles on persistent memory for Marcin Moskala.
- Workshops: I design and facilitate hands-on, deep-dive workshops focused on mastering software craftsmanship and storage internals.
Additionally, I spend time building educational systems from scratch to demystify how databases and distributed systems work under the hood.
Currently exploring
- Building a query engine (Relop) in Rust and writing a series on the internals of query engines
- Reading the book Performance Analysis and Tuning on Modern CPUs
- Writing technical essays on tech-lessons.in
Talks
Questioning database claims: Design patterns of storage engines
I gave a talk on “Questioning database claims: Design patterns of storage engines” at GoConIndia24 on 2nd December. Link to the talk.
The talk walks through common patterns in storage engines (key-value storage engines in particular): persistence (WAL, fsync), efficient retrieval (B+tree, bloom filters, data layouts), and efficient ingestion (sequential I/O, LSM, WiscKey). It then uses those patterns to question a variety of database claims, such as durability, read optimization, and write optimization, and to pick the right database(s) for a given use case.
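As a small illustration of one retrieval pattern named above, here is a minimal bloom filter sketch. It is not taken from the talk; the type `BloomFilter`, its parameters, and the choice of seeding the standard library hasher to derive several hash functions are assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A bloom filter answers "definitely absent" or "possibly present".
struct BloomFilter {
    bits: Vec<bool>,
    hash_count: u64,
}

impl BloomFilter {
    fn new(bit_count: usize, hash_count: u64) -> Self {
        assert!(bit_count > 0, "a bloom filter needs at least one bit");
        BloomFilter { bits: vec![false; bit_count], hash_count }
    }

    /// Derives the bit index for `key` under the hash function numbered `seed`.
    fn index(&self, key: &[u8], seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    /// Sets one bit per hash function.
    fn add(&mut self, key: &[u8]) {
        for seed in 0..self.hash_count {
            let index = self.index(key, seed);
            self.bits[index] = true;
        }
    }

    /// Returns false only if the key was never added; true may be a false positive.
    fn may_contain(&self, key: &[u8]) -> bool {
        (0..self.hash_count).all(|seed| self.bits[self.index(key, seed)])
    }
}

fn main() {
    let mut filter = BloomFilter::new(1024, 3);
    filter.add(b"consensus");
    filter.add(b"storage");
    println!("storage: {}", filter.may_contain(b"storage"));
    println!("raft: {}", filter.may_contain(b"raft"));
}
```

A storage engine would typically consult such a filter before reading a file from disk: a `false` answer lets it skip the file entirely, while a `true` answer still requires the actual lookup.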
Some Projects
🔹 Relop Relop is a minimal, in-memory implementation of relational operators built to explore query processing. It covers the entire pipeline from lexical analysis and parsing to logical planning and execution.
Key Features
- SQL Support: Supports basic selection, filtering (WHERE), ordering, and joins.
- Educational Focus: Built with a focus on understanding the internals of a query engine, inspired by Crafting Interpreters and Database Design and Implementation.
- End-to-End Pipeline: Implements the query parsing flow including tokenization, AST generation, logical plans, and physical execution via iterators.
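The "physical execution via iterators" idea can be sketched as a pull-based pipeline, where each operator asks its input for one row at a time. The following is a minimal illustration under assumed names (`Operator`, `Scan`, `Filter`, `Project`, `adults_plan`); it is not Relop's actual API, and rows are simplified to vectors of integers.

```rust
/// A row is a list of integer columns, to keep the sketch small.
type Row = Vec<i64>;

/// A pull-based operator: each call produces at most one row.
trait Operator {
    fn next_row(&mut self) -> Option<Row>;
}

/// Produces the rows of an in-memory table, one at a time.
struct Scan {
    rows: std::vec::IntoIter<Row>,
}

impl Operator for Scan {
    fn next_row(&mut self) -> Option<Row> {
        self.rows.next()
    }
}

/// Passes through only the rows that satisfy the predicate (WHERE).
struct Filter {
    input: Box<dyn Operator>,
    predicate: fn(&Row) -> bool,
}

impl Operator for Filter {
    fn next_row(&mut self) -> Option<Row> {
        while let Some(row) = self.input.next_row() {
            if (self.predicate)(&row) {
                return Some(row);
            }
        }
        None
    }
}

/// Keeps only the requested column positions (SELECT list).
struct Project {
    input: Box<dyn Operator>,
    columns: Vec<usize>,
}

impl Operator for Project {
    fn next_row(&mut self) -> Option<Row> {
        let row = self.input.next_row()?;
        Some(self.columns.iter().map(|&column| row[column]).collect())
    }
}

/// Column 1 holds the age in this example table.
fn is_adult(row: &Row) -> bool {
    row[1] >= 18
}

/// Roughly: SELECT id FROM people WHERE age >= 18, with columns (id, age).
fn adults_plan(table: Vec<Row>) -> Box<dyn Operator> {
    let scan = Box::new(Scan { rows: table.into_iter() });
    let filter = Box::new(Filter { input: scan, predicate: is_adult });
    Box::new(Project { input: filter, columns: vec![0] })
}

/// Drains the root operator, pulling rows through the whole pipeline.
fn run(mut root: Box<dyn Operator>) -> Vec<Row> {
    let mut output = Vec::new();
    while let Some(row) = root.next_row() {
        output.push(row);
    }
    output
}

fn main() {
    let table = vec![vec![1, 30], vec![2, 17], vec![3, 45]];
    println!("{:?}", run(adults_plan(table))); // prints [[1], [3]]
}
```

Because every operator exposes the same single-method interface, a planner can compose them freely, and no operator materializes more rows than it needs.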
🔹 Go-LSM An LSM-based key-value store in Go for educational purposes, inspired by LSM in a Week. It is a rewrite of the existing workshop code.
Exploring LSM with go-lsm
- Learn LSM from the ground up: Dive deep into the core concepts of Log-Structured Merge-Trees (LSM) through a practical, well-documented implementation.
- Benefit from clean code: Analyze a meticulously crafted codebase that prioritizes simplicity and readability.
- Gain confidence with robust tests: Verify the correctness and reliability of the storage engine through comprehensive tests.
- Experiment and extend: Customize the code to explore different LSM variations or integrate it into your own projects.
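For readers new to the idea, the core read and write paths of an LSM tree can be sketched in a few lines. This is a deliberately simplified illustration in Rust, not go-lsm's code (which is written in Go): the type `Lsm` and its methods are assumptions, the "SSTables" are in-memory sorted vectors, and the WAL, compaction, and bloom filters are all omitted.

```rust
use std::collections::BTreeMap;

/// `None` marks a deleted key (a tombstone).
type Value = Option<String>;

struct Lsm {
    /// Mutable, sorted, in-memory buffer that absorbs all writes.
    memtable: BTreeMap<String, Value>,
    /// Immutable sorted runs, oldest first; stand-ins for on-disk SSTables.
    sstables: Vec<Vec<(String, Value)>>,
    /// Number of entries after which the memtable is flushed.
    memtable_limit: usize,
}

impl Lsm {
    fn new(memtable_limit: usize) -> Self {
        Lsm { memtable: BTreeMap::new(), sstables: Vec::new(), memtable_limit }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.write(key, Some(value.to_string()));
    }

    /// A delete is just a write of a tombstone.
    fn delete(&mut self, key: &str) {
        self.write(key, None);
    }

    fn write(&mut self, key: &str, value: Value) {
        self.memtable.insert(key.to_string(), value);
        if self.memtable.len() >= self.memtable_limit {
            self.flush();
        }
    }

    /// Turns the memtable into a new immutable sorted run.
    fn flush(&mut self) {
        let run: Vec<(String, Value)> =
            std::mem::take(&mut self.memtable).into_iter().collect();
        self.sstables.push(run);
    }

    /// Checks the memtable first, then the runs from newest to oldest.
    fn get(&self, key: &str) -> Option<&str> {
        if let Some(value) = self.memtable.get(key) {
            return value.as_deref();
        }
        for run in self.sstables.iter().rev() {
            if let Ok(index) = run.binary_search_by(|(k, _)| k.as_str().cmp(key)) {
                return run[index].1.as_deref();
            }
        }
        None
    }
}

fn main() {
    let mut store = Lsm::new(2);
    store.put("a", "1");
    store.put("b", "2"); // memtable reaches the limit and is flushed
    store.put("a", "3"); // newer value shadows the flushed one
    println!("{:?}", store.get("a")); // prints Some("3")
}
```

The newest-first search order is what makes overwrites and tombstones work without ever modifying a flushed run.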
🔹 clearcheck Write expressive and elegant assertions with ease! clearcheck is designed to make assertion statements in Rust as clear and concise as possible. It allows chaining multiple assertions together for a fluent and intuitive syntax, leading to more self-documenting test cases.
let pass_phrase = "P@@sw0rd1 zebra alpha";
pass_phrase.should_not_be_empty()
    .should_have_at_least_length(10)
    .should_contain_all_characters(vec!['@', ' '])
    .should_contain_a_digit()
    .should_not_contain_ignoring_case("pass")
    .should_not_contain_ignoring_case("word");