This strategy is called “gradient descent.” It is used to find the values of a function’s parameters (coefficients) that minimize a cost function. You start by defining initial parameter values, and from there gradient descent uses calculus to iteratively adjust those values so that they minimize the given cost function. In other words, we are trying to find the weight value at which the error becomes minimal. Basically, we need to figure out whether to increase or decrease the weight value.
Once we know the direction, we keep updating the weight in that direction until the error stops decreasing. You may reach a point where updating the weight any further causes the error to increase again. That is when you stop, and that is your final weight value, as shown in the figure.
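To make this concrete, here is a minimal sketch of the idea in Python. The quadratic cost function, the learning rate, and the stopping rule are illustrative assumptions chosen for this example, not details from the text above. The sign of the gradient is what tells us whether to increase or decrease the weight: stepping against the gradient moves us toward lower error.

```python
# A minimal sketch of gradient descent in one dimension.
# The cost function, learning rate, and tolerance below are
# illustrative assumptions, not part of the original text.

def cost(w):
    """Example cost: a simple parabola with its minimum at w = 3."""
    return (w - 3) ** 2

def gradient(w):
    """Derivative of the cost with respect to the weight."""
    return 2 * (w - 3)

w = 0.0               # initial weight value
learning_rate = 0.1   # step size for each update
tolerance = 1e-8      # stop once the error barely changes

prev_error = cost(w)
for step in range(1000):
    w -= learning_rate * gradient(w)   # move against the gradient
    error = cost(w)
    if abs(prev_error - error) < tolerance:
        break   # error has stopped decreasing: this is the final weight
    prev_error = error

print(f"final weight: {w:.4f}, error: {cost(w):.6f}")
```

Note that the update `w -= learning_rate * gradient(w)` automatically handles both directions: a positive gradient decreases the weight, and a negative gradient increases it, which is exactly the “increase or decrease” decision described above.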
“In fact, the mere act of opening the box will determine the state of the cat, although in this case there were three determinate states the cat could be in: these being Alive, Dead, and Bloody Furious.” ― Terry Pratchett, Lords and Ladies