In this post, we’ll implement a GPT from scratch in just 60 lines of numpy. We’ll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.
The result is PicoGPT. Very cool. I’m a fan of simple educational implementations.
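To preview where we're headed, the heart of any GPT is an autoregressive loop: feed the model the token ids so far, take its prediction for the next token, append it, and repeat. Here's a minimal sketch of that loop in numpy, with a stub `gpt` function (random logits, a placeholder I'm using for illustration) standing in for the real trained model we'll build later in the post:

```python
import numpy as np

def gpt(inputs):
    # Placeholder for the real model: returns logits of shape
    # [n_seq, n_vocab]. The actual GPT-2 computes these from its
    # trained weights; here we just emit fixed random values.
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(inputs), 50257))

def generate(inputs, n_tokens_to_generate):
    for _ in range(n_tokens_to_generate):
        logits = gpt(inputs)                  # [n_seq, n_vocab]
        next_id = int(np.argmax(logits[-1]))  # greedy: most likely next token
        inputs.append(next_id)                # append prediction, then repeat
    return inputs[-n_tokens_to_generate:]     # return only the new token ids

out = generate([1, 2, 3], 5)  # generate 5 tokens after the prompt [1, 2, 3]
```

With a real model behind `gpt`, this same loop is what turns a prompt into generated text; everything else in the post is about filling in that function.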