Multiple Features:
Naturally, you don't want to predict house prices from size alone. You want to include more variables, or features, as we call them.
You might want to use the number of bedrooms, the number of floors, and the age of the home in years to predict its price.
When using multiple features, we use the following notation:
n = the number of features in the dataset.
x(i) = the i-th example in the training set (all of its feature values together).
x(i)(j) = the value of the j-th feature of the i-th example in the training set.
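To make the notation concrete, here is a small sketch using NumPy. The array name X_raw and all the numbers in it are made up for illustration. Note that the notation above counts from 1, while NumPy counts from 0.

```python
import numpy as np

# Hypothetical training set: each row is one house, and the columns are
# size, number of bedrooms, number of floors, age in years.
# These values are invented purely for illustration.
X_raw = np.array([
    [2104, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
])

n = X_raw.shape[1]     # n: the number of features (columns)
x_2 = X_raw[1]         # x(2): the second training example (index 1 in NumPy)
x_2_3 = X_raw[1, 2]    # x(2)(3): the third feature of the second example

print(n, x_2, x_2_3)
```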
Hypothesis:
Previously we had only one variable, the size. Now we have four: the size, the number of bedrooms, the number of floors, and the age. The hypothesis equation changes accordingly.
It now becomes: h = theta0 + theta1*x1 + theta2*x2 + theta3*x3 + theta4*x4.
The last four terms correspond to the four variables we are now using, so the number of thetas grows from two to five.
For convenience, we create a new variable x0 which is always equal to 1, so that we can write the above equation as follows:
h = theta0*x0 + theta1*x1 + theta2*x2 + theta3*x3 + theta4*x4
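The point of adding x0 = 1 is that the hypothesis for a single example becomes a plain dot product between the thetas and the features. Here is a minimal sketch of that idea. The theta values and the feature values are arbitrary numbers chosen for illustration, not fitted parameters.

```python
import numpy as np

# Hypothetical parameters theta0..theta4 (not learned from any data).
theta = np.array([80.0, 0.5, 5.0, 3.0, -1.0])

# One hypothetical house: size, bedrooms, floors, age.
features = np.array([2000.0, 3.0, 2.0, 10.0])

# Prepend x0 = 1 so that theta0 gets multiplied by something.
x = np.concatenate(([1.0], features))

# The hypothesis written out term by term...
h_explicit = (theta[0] * x[0] + theta[1] * x[1] + theta[2] * x[2]
              + theta[3] * x[3] + theta[4] * x[4])

# ...and the same thing as a single dot product.
h_dot = theta @ x

print(h_explicit, h_dot)
```

Both expressions compute the same value. The dot product form is what lets us move to a matrix formulation.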
We can represent the thetas as a single column matrix of shape (5, 1). Similarly, we can represent X as a matrix in which each row is one training example and each column is one feature. The theta and X matrices will look something like this:
theta = [theta0]
        [theta1]
        [theta2]
        [theta3]
        [theta4]
       x0  x1  x2  x3  x4
X = [----- example 1 -----]
    [----- example 2 -----]
    [----- example 3 -----]
    [----- example 4 -----]
    [----- example 5 -----]
    ...... one row for each training example
Similarly, we can represent h as a column vector H, where each row holds the value of h for the corresponding row of the training set.
As you may have guessed, we can compute H for all the points in one single matrix operation:
H = X * theta
Here * denotes matrix multiplication. If you use Python with NumPy, you will want to write @ instead of *.
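Here is a short sketch of this matrix operation in NumPy, assuming a tiny made-up training set and arbitrary theta values (none of these numbers come from real data). On NumPy arrays, * multiplies element by element, which is why @ is used for the matrix product.

```python
import numpy as np

# Hypothetical training set: size, bedrooms, floors, age for three houses.
X_raw = np.array([
    [2104.0, 5.0, 1.0, 45.0],
    [1416.0, 3.0, 2.0, 40.0],
    [1534.0, 3.0, 2.0, 30.0],
])
num_examples = X_raw.shape[0]

# Add the x0 = 1 column in front, giving one row per example
# and one column per theta.
X = np.hstack([np.ones((num_examples, 1)), X_raw])

# Hypothetical thetas as a column matrix of shape (5, 1).
theta = np.array([[80.0], [0.5], [5.0], [3.0], [-1.0]])

# Matrix product: one hypothesis value per training example.
H = X @ theta

print(H.shape)
print(H)
```

Each row of H is the hypothesis for the matching row of X, so no explicit loop over the examples is needed.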