From 72543e7bd7aa19bcfa29c2f9c1ef4bafc5d2dad2 Mon Sep 17 00:00:00 2001
From: David Lohmann <5475305+DLohmann@users.noreply.github.com>
Date: Sun, 12 Oct 2025 00:49:57 -0700
Subject: [PATCH] Add visual to section3.md

---
 _includes/section3.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/_includes/section3.md b/_includes/section3.md
index cebdb4d..67d14f3 100644
--- a/_includes/section3.md
+++ b/_includes/section3.md
@@ -1,5 +1,6 @@
-In this section, we'll explore a number of concepts which will take us from the decoder-only Transformer architecture towards understanding the implementation choices and tradeoffs behind many of today's frontier LLMs. If you first want a birds-eye view the of topics in section and some of the following ones, the post ["Understanding Large Language Models"](https://magazine.sebastianraschka.com/p/understanding-large-language-models) by Sebastian Raschka is a nice summary of what the LLM landscape looks like (at least up through mid-2023).
+In this section, we'll explore a number of concepts which will take us from the decoder-only Transformer architecture towards understanding the implementation choices and tradeoffs behind many of today's frontier LLMs. If you first want a bird's-eye view of the topics in this section and some of the following ones, the post ["Understanding Large Language Models"](https://magazine.sebastianraschka.com/p/understanding-large-language-models) by Sebastian Raschka is a nice summary of what the LLM landscape looks like (at least up through mid-2023). For a visual overview of the scale and processing done by GPT-2 (small), nano-GPT, GPT-2 (XL), and GPT-3, the [LLM Visualization](https://bbycroft.net/llm) by Brendan Bycroft provides a detailed guide ([source](https://github.com/bbycroft/llm-viz)).
+