Evaluating generative models: perplexity, BLEU and FID
Evaluating a generative model is far harder than evaluating a classifier. In classification each input has one correct label; in generation a single prompt can have many equally good outputs. "The sun shines" and "A bright sun is shining" are both acceptable continuations of the same prompt, but a metric that demands an exact string match against the first will reject the second.
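To make the problem concrete, here is a minimal sketch that scores the two sentences above against each other. The helper names `exact_match` and `unigram_precision` are hypothetical and chosen only for this illustration. Whitespace tokenization and lowercasing are simplifying assumptions. Clipped unigram precision is used as a stand-in for overlap-based metrics (it is one building block of BLEU, not BLEU itself).

```python
from collections import Counter


def exact_match(candidate: str, reference: str) -> bool:
    """Return True only if the two strings are identical after trimming and lowercasing."""
    return candidate.strip().lower() == reference.strip().lower()


def unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference.

    Counts are clipped so a repeated candidate token cannot be credited
    more often than it occurs in the reference.
    Assumption: naive whitespace tokenization, no stemming.
    """
    cand_tokens = candidate.lower().split()
    ref_counts = Counter(reference.lower().split())
    if not cand_tokens:
        return 0.0
    cand_counts = Counter(cand_tokens)
    overlap = sum(min(count, ref_counts[token]) for token, count in cand_counts.items())
    return overlap / len(cand_tokens)


reference = "The sun shines"
candidate = "A bright sun is shining"

print(exact_match(candidate, reference))        # False
print(unigram_precision(candidate, reference))  # 0.2
```

Exact match rejects the paraphrase outright, and even the overlap score credits only the shared word "sun" (one of five candidate tokens), because "shines" and "shining" are different strings. Surface-level matching alone undervalues valid outputs.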