We have a linear model $y = ax + b$ that we need to fit to a set of training data $T$ so that it minimises a loss function $L$. For this, we'll choose the mean squared error $L(y, \hat y) = \frac{1}{|T|} \sum_{(x, y) \in T} (y - \hat y)^2$, where $\hat y = ax + b$ is the model's prediction for the input $x$. Since the loss function is convex in the parameters (i.e. $L'' \geq 0$), we can apply the conventional gradient descent algorithm without worrying about local minima. But first, we'll have to figure out $\frac{\partial L}{\partial a}$ and $\frac{\partial L}{\partial b}$, so that at each step of the training we know which way to move each parameter.
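To make the overall shape of the training loop concrete, here is a minimal Python sketch of gradient descent for this model. The toy data, the learning rate `lr`, and the step count are assumptions for illustration, not from the source; the gradient formulas in the comments are the standard partials of the MSE loss above.

```python
# Minimal gradient-descent sketch for fitting y = a*x + b to data T,
# minimising L = (1/|T|) * sum((y - y_hat)^2).
# Data, learning rate and iteration count are illustrative assumptions.

T = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]  # (x, y) pairs

a, b = 0.0, 0.0   # initial parameters
lr = 0.05         # learning rate (assumed)

for step in range(2000):
    n = len(T)
    # Partial derivatives of L with respect to a and b:
    #   dL/da = (2/|T|) * sum((a*x + b - y) * x)
    #   dL/db = (2/|T|) * sum(a*x + b - y)
    grad_a = (2 / n) * sum((a * x + b - y) * x for x, y in T)
    grad_b = (2 / n) * sum((a * x + b - y) for x, y in T)
    # Move each parameter against its gradient.
    a -= lr * grad_a
    b -= lr * grad_b

print(f"a = {a:.3f}, b = {b:.3f}")  # approaches the least-squares fit
```

Because the loss is convex, any starting values of `a` and `b` will converge toward the same minimum; the learning rate only affects how quickly (or whether the steps overshoot).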