Course Staff

Percy Liang



What is this course about?

Language models serve as the cornerstone of modern natural language processing (NLP) applications and open up a new paradigm in which a single general-purpose system addresses a range of downstream tasks. As the fields of artificial intelligence (AI), machine learning (ML), and NLP continue to grow, a deep understanding of language models becomes essential for scientists and engineers alike. This course is designed to give students a comprehensive understanding of language models by walking them through the entire process of developing their own. Drawing inspiration from operating systems courses in which students build an entire operating system from scratch, we will lead students through every aspect of language model creation, including data collection and cleaning for pre-training, transformer model construction, model training, and evaluation before deployment.


Note that this is a 5-unit class and is very implementation-heavy, so please allocate enough time for it.



All deadlines are listed in the schedule.

Honor code

Like all other classes at Stanford, this class takes the student Honor Code seriously. Please respect the following policies:

Submitting coursework

Late days

Regrade requests

If you believe that the course staff made an objective error in grading, you may submit a regrade request on Gradescope within 3 days after the grades are released.


We would like to thank Together AI for sponsoring the compute for this class.


Percy's lectures are all in Python and available at this repository.

# | Date | Description | Course Materials | Events | Deadlines
1 | Mon April 1 | Overview, tokenization (Percy) | | Assignment 1 out |
2 | Wed April 3 | PyTorch, resource accounting (Percy) | | |
3 | Mon April 8 | Architectures, hyperparameters (Tatsu) | lecture 3.pdf | |
4 | Wed April 10 | Mixture of experts (Tatsu) | lecture 4.pdf | |
5 | Mon April 15 | GPUs (Tatsu) | lecture 5.pdf | | Assignment 1 due
6 | Wed April 17 | Kernels, Triton (Percy) | | Assignment 2 out |
7 | Mon April 22 | Parallelism (Tatsu) | lecture 7.pdf | |
8 | Wed April 24 | Parallelism (Percy) | | |
9 | Mon April 29 | Scaling laws (Tatsu) | lecture 9.pdf | |
10 | Wed May 1 | Scaling laws (Tatsu) | lecture 10.pdf | Assignment 3 out | Assignment 2 due
11 | Mon May 6 | Data (Percy) | | |
12 | Wed May 8 | Data (Percy) | | | Assignment 3 due
- | Sat May 11 | | | Assignment 4 out |
13 | Mon May 13 | Data (Percy) | | |
14 | Wed May 15 | Data (Percy) | | |
15 | Mon May 20 | Alignment (Tatsu) | lecture 15.pdf | |
16 | Wed May 22 | Alignment (Tatsu) | lecture 16.pdf | |
- | Mon May 27 | Memorial Day - no classes | | | Assignment 4 due
17 | Wed May 29 | Evals (Tatsu) | lecture 17.pdf | Assignment 5 out |
18 | Mon June 3 | Guest lecture by Ce Zhang | | |
19 | Wed June 5 | Guest lecture by Aakanksha Chowdhery | | |