# about me

[twitter](https://twitter.com/naklecha) [github](https://github.com/naklecha) [buymeacoffee](https://www.buymeacoffee.com/naklecha) [instagram](https://instagram.com/naklecha)

Hey! I lead machine learning at [glaive.ai](https://glaive.ai/), where we produce use-case-specific training data and LLM finetunes. We have one of the fastest and highest-quality synthetic data pipelines in the world -- reach out to me if you are interested in knowing more, or if you would like to take the under on my claim, I love free money :)

Also, I founded [aaaaaaaaaa.org](https://aaaaaaaaaa.org/) with a mission of making research more accessible! We (currently a team of 6) are doubling down on creating a meaningful bridge for new people to enter fields that we find fascinating. This is where I spend the majority of my free time & I genuinely believe that this is a real and growing problem that needs to be solved!

### ongoing projects

This is a rough breakdown of what I do with my free time:

- 25% spent building Rocket League bots using reinforcement learning
- 25% spent studying new machine learning topics/reading papers
- 25% spent working on a new comprehensive data parallelism blog, see my tweet [here](https://x.com/naklecha/status/1889361800677786086)
- 25% spent going on walks, listening to music, reading books, podcasts, videos, etc.
### highlights

- 2025, my [reinforcement learning guide](https://x.com/naklecha/status/1878080308903284866) went viral and was read by 15k+ people. I also got a *"Yo love the RL guide"* message from [REDACTED], someone I really admire and a billionaire. That was fun. ![[rltweet.png | 300]]
- 2024, my [llama3-from-scratch](https://github.com/naklecha/llama3-from-scratch) implementation went extremely viral. The repository currently has 14k+ GitHub stars (the blog was read by 50-100k people) & it has been translated into many languages by the open source community. It went viral because [Andrej Karpathy](https://en.wikipedia.org/wiki/Andrej_Karpathy) replied to and retweeted [my tweet](https://x.com/naklecha/status/1792244347225641338), and the [repository was #1 on HackerNews](https://news.ycombinator.com/item?id=40408880) for 24 hours and one of the highest-voted posts of that week with 1000+ upvotes. ![[llama3github.png | 300]] ![[andrejtweet.png | 300]] ![[hn.png | 300]]
- 2023, I won $25k at a game-show-style contest called [Buildspace](https://buildspace.so/) (season 4). Being one of the winners of the event was one of the craziest things that has ever happened to me & shaped the way I think about building ideas :')
- 2022-2023, I built a bunch of apps. Building something people want is the ultimate form of truth seeking, and I believe everyone needs a time in their life where they practice that muscle. These were the apps I built and their usage:
	- [stockmusic.app](https://stockmusic.app/) -- 15,000 users
	- [whatonearth.ai](https://whatonearth.ai/) -- 40,000 users
	- [fashionai.me](https://fashionai.me/) -- 31,000 users
	- [google colab copilot](https://copilot.naklecha.com/) -- 22,000 users

	![[analytics.png]]
- 2022, this was the year I went absolute demon mode and won 7 hackathons in a row, earning ~$10k in prize money (which was a lot of money to me at the time). This was also the year I 10xed my shipping speed :)
- 2015-2017, I spent time doing competitive programming. I was world #6 on [Hackerrank](https://en.wikipedia.org/wiki/HackerRank)'s algorithm leaderboard & in 2021 I had a rating of 2100+ on [CodeChef](https://en.wikipedia.org/wiki/CodeChef)'s leaderboard in <10 competitions. ![[codechef.png| 300]]

### career and history

Most of my work at [glaive.ai](https://glaive.ai/) is still under wraps for [REDACTED] companies, but these are the projects I can share publicly:

- In Feb 2025, we trained a speculative decoding draft model for [Groq](https://en.wikipedia.org/wiki/Groq)'s R1 model. We were able to achieve a pretty high token acceptance rate (using a very small model), and the R1 70b model on Groq hits >1.6k tokens/sec with the help of the draft model we trained (a rough sketch of how speculative decoding works is below this list). Famous tech YouTuber [Theo](https://www.youtube.com/@t3dotgg) reacted to the model's performance -- [link to the reaction](https://x.com/bcjordan/status/1887323628737282403?s=46) :) ![[groqspec.png | 300]]
- In July 2024, we worked with Groq to train a state-of-the-art (and open source) function calling model, deployed on Groq's infrastructure, that produced high-quality tool-use outputs at >1k tokens per second (a hypothetical example of a tool call is sketched below this list). I worked with [Sahil](https://x.com/csahil28) and [Rick](https://x.com/RickLamers) on this project. We even got a shout-out from [Yann LeCun](https://en.wikipedia.org/wiki/Yann_LeCun) for our model :) ![[bfcl.png | 300]] ![[yann.png | 300]]
- In 2022-2024, I was doing my Master's degree in Artificial Intelligence at [TU Delft](https://en.wikipedia.org/wiki/Delft_University_of_Technology), but I eventually dropped out of the university to work on much, much harder problems at Glaive.
- In 2020-2022, at [Morgan Stanley](https://www.morganstanley.com/) (the investment bank) I worked on a search system for our (very important & rich) clients. My role in the team was to classify all existing and upcoming analyst articles. To do this, I finetuned a [BERT](https://en.wikipedia.org/wiki/BERT_(language_model)) classifier & it was a fun experience. In 2020, language model tooling was really bad, which made for a good learning experience -- for example, to fine-tune BERT on financial data I had to manually open the `tokenizer.json` file and other config files to add finance-specific special tokens (the last sketch below shows today's equivalent). I don't remember much about it, but I do remember that I finetuned a pretty useful classifier, deployed it to production, and it was used by all our happy clients :)
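Since speculative decoding comes up above, here's a minimal sketch of the core idea, assuming the standard draft-and-verify scheme: a small draft model proposes a few tokens cheaply, and the large target model verifies them all at once, accepting each proposal with probability min(1, p/q). This is a toy illustration with made-up stand-in distributions -- not the model or code we shipped to Groq:

```python
# Toy sketch of one speculative decoding step (NOT production code): a cheap
# draft model proposes k tokens, the expensive target model verifies them.
import numpy as np

VOCAB = 50  # toy vocabulary size
rng = np.random.default_rng(0)

def _toy_dist(context, salt):
    # deterministic made-up next-token distribution, keyed on the context
    r = np.random.default_rng(hash((salt, tuple(context))) % (2**32))
    logits = r.standard_normal(VOCAB)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def draft_dist(context):   # stand-in for the small draft model, q(t | context)
    return _toy_dist(context, "draft")

def target_dist(context):  # stand-in for the big target model, p(t | context)
    return _toy_dist(context, "target")

def speculative_step(context, k=4):
    # 1) the draft model proposes k tokens autoregressively (cheap)
    proposed, ctx = [], list(context)
    for _ in range(k):
        q = draft_dist(ctx)
        t = int(rng.choice(VOCAB, p=q))
        proposed.append((t, q[t]))
        ctx.append(t)
    # 2) the target model scores all k positions in ONE parallel pass
    #    (done serially here for clarity), accepting token t with probability
    #    min(1, p(t)/q(t)) -- this keeps the output distribution exactly p
    accepted, ctx = [], list(context)
    for t, qt in proposed:
        p = target_dist(ctx)
        if rng.random() < min(1.0, p[t] / qt):
            accepted.append(t)
            ctx.append(t)
        else:
            # on rejection, resample from the residual max(p - q, 0),
            # renormalized, and stop verifying the remaining drafts
            residual = np.maximum(p - draft_dist(ctx), 0.0)
            accepted.append(int(rng.choice(VOCAB, p=residual / residual.sum())))
            break
    # in real systems, if all k drafts are accepted, the same target pass
    # also yields one extra token for free
    return accepted

print(speculative_step([1, 2, 3]))  # 1 to k tokens per expensive target pass
```

The "token acceptance rate" is the fraction of drafted tokens that survive step 2 -- the higher it is, the more tokens you get per (expensive) target-model pass, which is where speeds like >1.6k tokens/sec come from.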
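And a hypothetical example of what tool use looks like from the model's side. To be clear, this is not the actual schema of the model we trained with Groq; the tool name, prompt, and JSON format below are the common convention, invented here for illustration:

```python
# Hypothetical function calling flow (the tool, prompt, and output format
# are illustrative -- not the actual schema of the model we trained).
import json

# the tool schema handed to the model alongside the user's prompt
tools = [{
    "name": "get_stock_price",
    "description": "Look up the latest price for a ticker symbol.",
    "parameters": {
        "type": "object",
        "properties": {"ticker": {"type": "string"}},
        "required": ["ticker"],
    },
}]

# for a prompt like "What is NVDA trading at?", a function calling model
# emits a structured call instead of prose:
model_output = '{"name": "get_stock_price", "arguments": {"ticker": "NVDA"}}'
call = json.loads(model_output)

def get_stock_price(ticker: str) -> str:
    return f"{ticker}: $100.00"  # stub standing in for a real market-data API

# dispatch the call, then feed the result back to the model as context
available = {"get_stock_price": get_stock_price}
print(available[call["name"]](**call["arguments"]))
```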
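Finally, for the Morgan Stanley story: hand-editing `tokenizer.json` is thankfully no longer necessary. Here's roughly how you'd add domain-specific special tokens with today's Hugging Face `transformers` library -- the token strings and label count are made-up examples, not what I actually used:

```python
# Adding finance-specific special tokens to BERT with modern tooling
# (the tokens and num_labels below are illustrative, not what I used).
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=8  # e.g. 8 article categories
)

# register new tokens so the tokenizer never splits them into subwords
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["[TICKER]", "[ANALYST]", "[RATING]"]}
)

# grow the embedding matrix to match the enlarged vocabulary; the new
# rows are randomly initialized and get learned during finetuning
model.resize_token_embeddings(len(tokenizer))

ids = tokenizer("[TICKER] upgraded to overweight by [ANALYST]")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
```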