LESSON 2.1 What is a tokenizer? What makes a tokenizer "good"?

LESSON 2.2 Build a baseline 'word-based' tokenizer

AI Bootcamp

MODULE 1
Introduction to AI applications
This module introduces foundational concepts and practical workflows for working with Large Language Models (LLMs). Topics include terminology (e.g., ChatGPT vs. LLM, inference phases, training stages, and model compression techniques), the LLM ecosystem (vector databases, inference APIs, and fine-tuning libraries), and the model lifecycle. Participants will build a simple LLM-based system from scratch, starting with “Hello World” inference using Hugging Face, and then deploy a serverless LLM API using Modal.
MODULE 2
Building a Shakespearean Language Model
MODULE 3
Building an n-gram language model
MODULE 4
Building self-attention
- LESSON 4.1 A minimal version of self-attention
- LESSON 4.2 Build a batched version of self-attention
MODULE 5
Building the feed-forward neural network
MODULE 6
Assembling the transformer-based language model
MODULE 7
Evaluating and deploying a transformer-based language model
MODULE 8
Datasets
MODULE 9
Low-Rank Adapters for Instruction Tuning
MODULE 10
Retrieval-Augmented Generation (RAG)
MODULE 11
The Future of Large Language Models
MODULE 12
Machine learning operations
MODULE 13
Agents
- LESSON 13.1 What are agents?
- LESSON 13.2 Design patterns

LESSON 2.3 Build a byte-pair encoding

Build a baseline 'word-based' tokenizer
