nanochat by Andrej Karpathy
Andrej Karpathy recently released nanochat, a new GitHub repo with which one can train a mini ChatGPT-like LLM for under $100. The repo is described as:
The best ChatGPT that $100 can buy.
This repo is a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase. nanochat is designed to run on a single 8XH100 node via scripts like speedrun.sh, that run the entire pipeline start to end. This includes tokenization, pretraining, finetuning, evaluation, inference, and web serving over a simple UI so that you can talk to your own LLM just like ChatGPT. nanochat will become the capstone project of the course LLM101n being developed by Eureka Labs.
Karpathy says:
My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo.
This is awesome! One can learn so much about training LLMs from this single repo.
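To make that concrete, here is a minimal sketch, assuming a machine with the 8XH100 GPUs the repo targets, of how one might kick off the pipeline. The repo's README is the authoritative source for the exact commands and environment setup; only speedrun.sh itself is named in the description above.

```bash
# Hedged sketch: clone the repo and launch the end-to-end pipeline.
# The exact invocation and any prerequisites are documented in the
# nanochat README; this only illustrates the overall workflow.
git clone https://github.com/karpathy/nanochat.git
cd nanochat

# speedrun.sh is the script mentioned above that runs the whole pipeline
# (tokenization, pretraining, finetuning, evaluation, inference, serving)
# start to end on a single 8XH100 node.
bash speedrun.sh
```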