If a vector is a point's position, a matrix is a command that makes that point move. A matrix can rotate, stretch, squash, or warp all of space at once. In machine learning, a neural layer's weights are the numbers in a matrix. When data passes through the layer, the weight matrix literally warps space, trying to place points of different classes farther apart.
Linear transforms — what a matrix can do
Matrix multiplication A×B is not just arithmetic. It's applying two transforms in sequence: first B, then A on top. Hover over a cell in the result matrix C below to see which row of A and which column of B combine. Order matters — A×B is usually not equal to B×A.
| 2 | 3 | 1 |
| 0 | 4 | 2 |
| 1 | 2 |
| 0 | 1 |
| 3 | 0 |
| 5 | 7 |
| 6 | 4 |
Every dense layer is literally y=Wx+b. Matrix W warps space; bias b shifts it. Training means tuning the entries of W so that after all layers, points of different classes (e.g. cats vs dogs) end up far apart and easy to separate.
import numpy as np
W = np.array([[0, -1], [1, 0]])
x = np.array([1, 0])
y = W @ x # [0, 1]
inv_W = np.linalg.inv(W)
det = np.linalg.det(W) # = 1.0