Fundamentals of transformers - Live Workshop
Everyone knows ChatGPT, but how do modern large language models actually work? The fundamentals start with the transformer. This workshop demystifies the transformer and takes you from concept to code. It combines intuition, code, and math, all with the goal of giving you an end-to-end understanding of the fundamentals of large language models.
- 5.0 / 5 (1 rating)
2 hrs 42 mins
20 Videos
Alvin Wan
Currently at OpenAI. Previously, he was a Senior Research Scientist at Apple working on large language models for Apple Intelligence. He formerly worked on Tesla Autopilot and earned his PhD at UC Berkeley, with 3000+ citations and 800+ stars for his work.
01 Live and remote
You can take the workshop from anywhere in the world, as long as you have a computer and an internet connection. You also have the opportunity to ask the instructors questions live.
02 Recorded
Learn at your own pace, whenever it's convenient for you. With no rigid schedule to worry about, you can take the course on your own terms.
03 Build
Learn by building while you learn the concepts.
04 Community
Join a vibrant community of other students who are also learning with Fundamentals of transformers. Ask questions, get feedback and collaborate with others to take your skills to the next level.
What You Will Build In This Workshop
Intro to LLM Basics: Learn foundational concepts, including terminology like models, data, algorithms, and optimization.
Autoregressive Decoding: Grasp how LLMs predict words through conditional generation, supported by manual inference demos.
LLM Prediction Mechanism: Explore LLM architecture, with an intuitive look at vectors and word embeddings.
Semantic Meaning in Embeddings: See how word embeddings represent semantic meaning through nearest neighbors and vector demos.
Transformer Core Mechanics: Unpack the inner workings of a transformer layer, including self-attention and context addition.
Non-Linear Transformations: Discover why non-linearities are essential, supported by hands-on matrix multiplication and MLP demos.
Positional Encoding: Learn absolute and relative positional encoding techniques and the differences between them, plus RMS Norm for positional bias management.
Attention Mechanisms: Delve into “forward-facing” and multi-head attention to understand attention values.
Advanced Attention: Study grouped-query attention and its importance in handling large data.
Current Transformer Models: Analyze academic and modern transformer diagrams, identifying bottlenecks in today’s LLMs.
Build a Mini LLM Inference Tool: Create a simplified version of Hugging Face’s LLM utility to understand LLM operation.
Understand Word Embeddings: Develop interactive demos exploring word embeddings and how models represent words.
Visualize Self-Attention: Use visualization tools to understand the role of self-attention in language models.
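To make the autoregressive decoding item above concrete, here is a minimal sketch, not the workshop's own code. It assumes a toy lookup table (the hypothetical `NEXT_TOKEN_PROBS`, with invented tokens and probabilities) in place of a real model, and a hypothetical `greedy_decode` helper that appends the most likely next token one step at a time.

```python
# Toy stand-in for an LLM: maps the previous token to next-token probabilities.
# Every token and probability here is invented purely for illustration.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.9, "cat": 0.1},
    "the": {"cat": 0.6, "mat": 0.4},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
    "mat": {"</s>": 1.0},
}


def predict_next(tokens):
    """Return a next-token distribution conditioned on the context.

    A real LLM conditions on every token so far; this toy only
    looks at the last one.
    """
    return NEXT_TOKEN_PROBS[tokens[-1]]


def greedy_decode(prompt, max_new_tokens=10):
    """Generate tokens one at a time, feeding each prediction back in."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = predict_next(tokens)
        next_token = max(probs, key=probs.get)  # greedy: pick the most likely token
        tokens.append(next_token)
        if next_token == "</s>":  # stop at the end-of-sequence token
            break
    return tokens


print(greedy_decode(["<s>"]))  # ['<s>', 'the', 'cat', 'sat', '</s>']
```

The loop is the autoregressive part: each predicted token becomes part of the context for the next prediction. Real systems often sample from the distribution instead of always taking the maximum.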
In this workshop, we dive deep into Large Language Models (LLMs) to help you understand, build, and optimize their architecture for real-world applications. LLMs are transforming industries—from customer support to content creation—but understanding how these models work, let alone optimizing them, can be challenging.
In this comprehensive 9-module series, we cover:
- The technical essentials of LLMs, including autoregressive decoding, positional encoding, and multi-head attention
- The entire LLM lifecycle, from pretraining on massive datasets to fine-tuning and instruction tuning for specialized tasks
- Best practices for evaluating LLMs, identifying bottlenecks, and leveraging state-of-the-art architectures for efficiency and scalability
This workshop includes hours of in-depth instruction, hands-on coding exercises, and access to a community forum for support and discussions. You'll also gain exclusive access to source code templates, an expansive reference library, and downloadable materials for continued learning.
It's taught by Alvin Wan, who is currently at OpenAI, was previously a Senior Research Scientist at Apple, and earned his PhD at UC Berkeley, with international recognition for his contributions to efficient AI and design. Drawing on his industry experience and research, he guides you from fundamentals to advanced concepts with clarity and precision.
By the end of this workshop, you’ll not only understand how to create and optimize LLMs but also how to apply this knowledge across various applications in tech and business.
Workshop Syllabus and Content
What are LLMs?
2 Lessons 11 Minutes
Demystifying terminology behind LLMs
- 01 Intro (Sneak Peek) 00:07:09
- (Sneak Peek) 00:04:06
What LLMs predict
3 Lessons 20 Minutes
Introduction to Autoregressive Decoding
- 01 Tokens (Sneak Peek) 00:07:12
- (Sneak Peek) 00:09:28
- (Sneak Peek) 00:04:01
How LLMs predict
3 Lessons 19 Minutes
The architecture for a Large Language Model
- (Sneak Peek) 00:04:37
- (Sneak Peek) 00:04:17
- (Sneak Peek) 00:10:13
How Transformers predict
4 Lessons 40 Minutes
The innards of a transformer layer
- (Sneak Peek) 00:09:19
- (Sneak Peek) 00:15:31
- (Sneak Peek) 00:07:27
- (Sneak Peek) 00:08:34
How LLMs use position
5 Lessons 39 Minutes
How to Leverage Positional Bias
- (Sneak Peek) 00:04:08
- (Sneak Peek) 00:08:29
- (Sneak Peek) 00:05:31
- (Sneak Peek) 00:14:34
- 05 RMS norm (Sneak Peek) 00:06:46
How LLMs attend
1 Lesson 7 Minutes
How to find the needle in the haystack
- (Sneak Peek) 00:07:31
Modern LLM connection to papers
2 Lessons 23 Minutes
Connection to papers
- (Sneak Peek) 00:03:55
- 02 Q&A (Sneak Peek) 00:19:48
Frequently Asked Questions
How is this workshop structured, and what topics does it cover?
This workshop introduces Large Language Models (LLMs) from foundational concepts to practical applications. Key topics include terminology, architecture (such as transformers and embeddings), ecosystem overview, market applications, developer tools, product integrations, and predictive mechanisms in LLMs. It also delves into advanced topics like autoregressive decoding, multi-head attention, and performance optimization techniques.
Is this workshop suitable for my skill level?
The course is designed for learners with a basic understanding of programming and machine learning concepts. However, it covers a range of levels. For beginners, fundamental concepts are introduced in detail, while advanced sections, such as transformer architecture and multi-query attention, offer in-depth insights. You can skip advanced sections and focus on introductory modules if you’re newer to the topic.
Will I get real-world examples and practical applications in this workshop?
Yes, the course emphasizes practical, real-world applications of LLMs, with use cases such as chatbots, coding assistants, and data analysis tools. For example, demonstrations of how embeddings and transformer layers work are provided to help bridge theoretical knowledge with application.
How frequently is the course content updated?
The course content is reviewed and updated regularly to keep pace with advances in LLM technologies and AI development practices. This includes updates to information about tools, libraries, and popular AI-centric platforms, ensuring relevance in a rapidly evolving field.
Does this workshop cover current AI tools and integrations?
Yes, we cover a broad spectrum of contemporary tools and integrations. This includes popular platforms like Apple Intelligence, Google AI, and tools for developers such as vector databases and fine-tuning frameworks, ensuring a comprehensive understanding of the LLM ecosystem.
How are complex concepts like self-attention and autoregressive decoding explained?
Complex topics are broken down through visualizations, interactive examples, and analogies. For example, self-attention and autoregressive decoding are explained step-by-step, using diagrams that “unroll” the process for easy understanding.
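As one way to picture the step-by-step "unrolling" described above, here is a minimal sketch of single-head self-attention with a causal mask, written in plain Python. It is an illustration under stated assumptions, not the workshop's material: the function name `causal_self_attention` is hypothetical, and the input vectors are used directly as queries, keys, and values, whereas a real transformer layer first applies learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees only positions 0..i."""
    d = len(keys[0])
    n = len(queries)
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Score the current query against every key up to and including position i.
        scores = [dot(q, keys[j]) / math.sqrt(d) for j in range(i + 1)]
        weights = softmax(scores)
        # The output is the attention-weighted average of the visible value vectors.
        out = [
            sum(w * values[j][k] for j, w in enumerate(weights))
            for k in range(len(values[0]))
        ]
        outputs.append(out)
        # Pad with zeros so each row shows the masked (future) positions explicitly.
        all_weights.append(weights + [0.0] * (n - i - 1))
    return outputs, all_weights


# Three toy 2-dimensional token vectors (assumed values, for illustration only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one position attends to every other position. The zeros to the right of the diagonal are the causal mask, which is what lets the model generate text one token at a time without looking ahead.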
Will I be able to access this workshop on my mobile or tablet?
Yes, the course is optimized for access across multiple devices, including mobile, tablet, and desktop, ensuring flexibility to learn on the go.
Is there a certificate upon completion of the course?
Yes, you can get a certificate by sending us a message.
Can I ask questions during the course?
Yes, you can ask questions in the comments section of each lesson, and our team will respond as quickly as possible. You can also ask us questions anytime through the community-driven Discord channel.
Can I download the course videos?
No, the course videos cannot be downloaded, but they can be accessed online at any time.
What is the price of the course?
The course is currently priced at $197 USD.
How is this workshop different from other content available on LLMs?
This workshop on Large Language Models (LLMs) stands out by delivering not only a foundational understanding of LLMs but also practical, real-world applications tailored to industry-specific challenges. We focus on case studies, interactive labs, and personalized support that go beyond theoretical knowledge, ensuring you’re equipped to implement LLMs effectively in your own work environment. By the end of this workshop, you'll have actionable insights and hands-on skills that are immediately applicable, setting you apart in a rapidly evolving tech landscape.