About Me
Posted on February 24, 2023 • 3 minutes • 570 words
Sarthak Makhija
My Projects
I love working on my own projects in my free time. Some of them are:
🔹 CacheD
CacheD is a high-performance, LFU-based in-memory cache in Rust, inspired by Ristretto. The crate for CacheD is available here.
🔹 bitcask
This is a Go implementation of Riak's bitcask paper. The project is for educational purposes only. The idea is to provide a reference implementation of bitcask that helps anyone interested in storage engines understand the basics of building a persistent key-value storage engine.
🔹 goselect
goselect provides an SQL-like 'select' interface for files. This means one can execute a select query like:
select name, path, size from . where or(like(name, result.*), eq(isdir, true)) order by 3 desc
to get the file name, path and size of all entries that are directories or whose names begin with the term result. The query orders the results by size (the third column) in descending order.
I created goselect to understand the following:
- Parsing: The parsing pipeline typically involves a lexer, a parser and an AST.
- Recursive descent parser
- Representation of functions in the code. Functions like lower, upper and trim take a single parameter, functions like now and currentDate take zero parameters, whereas functions like concat take a variable number of parameters.
- Execution of scalar functions like lower, upper and substr. These functions are stateless and run for each row. They may take a parameter and return a value for each row. These functions can also involve nesting. For example, lower(substr(ext, 1)).
- Execution of aggregation functions like average, min, max and countDistinct. These functions run over a collection of rows and return an ultimate value. These functions can also involve nesting. For example, average(count()).
- Execution of nested scalar and aggregation functions like countDistinct(lower(name)). Here, the function lower(name) runs for each row, whereas countDistinct runs over a collection of rows.
Refactoring is fun to learn and practice. It is even more fun to understand it together by playing a game.
All you have to do is identify code smells, justify each of them by going beyond ilities, finish all of this in a fixed time and get points for your team. Learn and have fun. This is the idea behind “Gamifying Refactoring”.
Data Anonymization tool helps build anonymized production data dumps, which can be used for performance testing, security testing, debugging and development. We implemented this tool in Kotlin, and it works with Java 8 and Kotlin.
🔹 Flips
Flips is an implementation of the Feature Toggles pattern for Java. Feature Toggles are a powerful technique that allows teams to change system behaviour without changing the code.
The idea behind Flips is to let users implement toggles with minimal configuration and coding. This library should work with Java 8 and Spring Core / Spring MVC / Spring Boot.
I have a separate blog about this library.
Next project
My current interest is in designing storage engines and databases, and I plan to build a key/value storage engine in Rust. The idea is to create a write-optimized storage engine using an LSM tree and provide read-optimized paths. Some ideas that I would like to explore are:
- Thread-per-core programming model
- Separating values from keys: WiscKey
- io_uring and glommio
- Cache bloom filters in memory
- Cache level-0 SSTables in memory
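To illustrate the bloom filter idea from the list above: an in-memory filter lets a read skip an SSTable that definitely does not contain the key. The sketch below is in Go purely for illustration (the planned engine would be in Rust). The BloomFilter type, its methods and the choice of FNV-based double hashing are all assumptions, not a design decision of the project, and the filter size is assumed to be greater than zero.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// BloomFilter is a minimal sketch: a bit set plus k probe positions per key.
type BloomFilter struct {
	bits []bool
	k    uint32
}

// NewBloomFilter creates a filter with size bits (size > 0) and k probes.
func NewBloomFilter(size int, k uint32) *BloomFilter {
	return &BloomFilter{bits: make([]bool, size), k: k}
}

// positions derives k indexes from two halves of one 64-bit FNV hash
// (double hashing); the second half is forced to be odd so it is never zero.
func (b *BloomFilter) positions(key []byte) []uint32 {
	h := fnv.New64a()
	h.Write(key)
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)|1
	out := make([]uint32, b.k)
	for i := uint32(0); i < b.k; i++ {
		out[i] = (h1 + i*h2) % uint32(len(b.bits))
	}
	return out
}

// Add sets every probe position of the key.
func (b *BloomFilter) Add(key []byte) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// MayContain returns false only if the key was definitely never added;
// true means "possibly present" and can be a false positive.
func (b *BloomFilter) MayContain(key []byte) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	bf := NewBloomFilter(1024, 3)
	bf.Add([]byte("user:1"))
	// Added keys are never reported absent, so this prints true.
	fmt.Println(bf.MayContain([]byte("user:1")))
}
```

A read path would consult MayContain first and open the SSTable only when it returns true, which trades a small amount of memory for fewer disk reads.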