Weight initialization: why the start matters
A neural net is optimization in a space with millions of dimensions. Where you start matters. Zero weights: all neurons are identical, symmetry is not broken, the net does not learn. Too large: activations saturate, the gradient dies. Too small: activations collapse toward zero. Xavier and He fix this by design: they choose weight variance so activation variance stays stable through the network.
Content is available with subscription.
Get full access to all courses on the platform for one year with a single payment.
▼
Unlike other platforms that charge per course, here you get everything for one price, and after one year of use there will be no automatic charge for the following year.