Pretraining LLMs: data, objectives and compute budget
When you talk to Claude or GPT-4, you are interacting with a model that has gone through two distinct phases. The first, and by far the most expensive, is pretraining: exposing the model to trillions of tokens of text and teaching it one simple task, predicting the next token. The second phase (alignment, covered in the next lesson) turns that raw capability into a helpful assistant.