The focus of this repo is the practical implementation and research understanding of modern language models (LMs): building, training, fine-tuning, and aligning models from scratch.
- Developing a good research foundation
- Learning and building things from First Principles ❤️
- Applying research thinking to solve real-world problems
- Training a full LLM is computationally expensive → focus is on Small Language Models (SLMs)
- Concepts scale directly from SLMs → LLMs
This work is deeply inspired by the incredible body of research and open contributions in the field of language modelling. I would like to acknowledge:
- Foundational research papers such as Attention Is All You Need, which introduced the Transformer architecture and revolutionized modern NLP
- Key advancements in large-scale language modelling, including works behind models like GPT and BERT
- Open-source communities, blogs, and educational resources that make complex concepts accessible and reproducible
This repository is an attempt to learn from these works, re-implement ideas from first principles, and build an intuitive as well as practical understanding of language models.