Multifeature Linear Regression

The math: Logically the same as the univariate case, just with multiple weights and features instead of a single $w$ and $x$. It is also more efficient to express all weights and features as vectors $\vec{w}$ and $\vec{x}$, so that their dot product ($\vec{w} \cdot \vec{x}$) is handled efficiently. Thus the model becomes: $f_{\vec{w},b}(\vec{x}^{(i)}) = \vec{w} \cdot \vec{x}^{(i)} + b$. And the cost function $J(w,b) = \frac{1}{2m} \sum\limits_{i=0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2$ becomes: $J(\vec{w},b) = \frac{1}{2m} \sum\limits_{i=0}^{m-1} (f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)})^2$. For gradient descent and the derivatives: ...
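
A minimal NumPy sketch of the vectorized model and cost described above, assuming `X` stores one training example per row; the specific numbers are made-up placeholders for illustration, not values from the course:

```python
import numpy as np

def predict(x, w, b):
    # f_{w,b}(x) = w . x + b -- the dot product covers all features at once
    return np.dot(w, x) + b

def compute_cost(X, y, w, b):
    # J(w,b) = (1 / 2m) * sum_i (f_{w,b}(x^(i)) - y^(i))^2
    m = X.shape[0]
    errors = X @ w + b - y        # vectorized predictions minus targets
    return np.sum(errors ** 2) / (2 * m)

# Toy data: two examples, four features each (illustrative numbers only)
X = np.array([[2104.0, 5.0, 1.0, 45.0],
              [1416.0, 3.0, 2.0, 40.0]])
y = np.array([460.0, 232.0])
w = np.array([0.4, 18.0, -50.0, -25.0])
b = 780.0

print(compute_cost(X, y, w, b))
```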

January 25, 2025 · 6 min

Linear Regression and Gradient Descent

This is a short summary of the first week of the machine learning course by Andrew Ng. The first thing he covered was the difference between supervised and unsupervised learning, but I currently care about the former. Supervised ML: giving the computer a dataset with sample answers of interest and telling it “find the correlation between the dataset and the answers of interest”, or more simply, “learn how to get me the answers I care about given this dataset”. ...
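
A rough sketch of that idea, assuming a toy housing dataset invented purely for illustration: `X` holds the data the computer is given and `y` holds the matching sample answers it should learn to reproduce:

```python
import numpy as np

# Each row of X is one example's features (e.g. size, number of bedrooms),
# and y holds the "answers of interest" for those examples (e.g. price).
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0]])
y = np.array([400.0, 330.0, 369.0])

# Supervised learning: find some function f such that f(X[i]) is close to y[i].
```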

January 20, 2025 · 6 min