LLM pretraining: data, stages, compute budget

When you talk to Claude or GPT-4, you are interacting with a model that went through two distinct training phases. The first, and by far the most expensive, is pretraining: exposing the model to trillions of tokens of text and teaching it one simple task, predicting the next token. The second phase (alignment, covered in the next lesson) turns that raw capability into a helpful assistant.
