
# Machine Learning HandsOn Tutorial With Python And R-Part-#12-ZyKite Technologies


In this video you will learn about data science and machine learning with Python and R, from beginner to advanced level.

Each concept is taught in a detailed manner, giving you chapter-wise, in-depth knowledge of data science and machine learning with Python and R.

Follow this tutorial and you will gain more confidence in data science and machine learning.

If you want more tutorials on IT-related technologies, subscribe to the channel.




### Transcript

00:00 all right let's do this logistic

00:02 regression intuition and you can

00:04 probably already tell by my voice that

00:06 I'm pretty excited there's some very

00:08 interesting slides coming up and this is

00:10 quite an important topic but at the same

00:12 time it is quite challenging so quick

00:14 heads up there will be some math and

00:16 I've done a few run throughs of this

00:19 presentation already and I really I will

00:23 try my best to convey everything in the

00:25 simplest way possible so let's get into

00:27 it we already know about the linear

00:30 regression we know that there's a simple

00:32 linear regression and it has this very

00:33 short formula with one independent

00:35 variable and we also have looked into

00:38 the multiple linear regression which has

00:40 many independent variables so we already

00:44 know how to deal with this type of

00:47 challenge so when we have a scatterplot

00:49 like that where on the horizontal axis

00:51 we've got the independent variable on

00:53 the vertical axis we've got the

00:54 dependent variable and this is an

00:56 example we looked at salary versus

00:58 experience how do we create a model here

01:03 so we use a simple linear regression it

01:06 puts a line through our data and that

01:08 line is modeling our observations so we

01:12 can basically forecast things and

01:14 compare our actual observations to

01:17 our model and so on but so we know how

01:20 to deal with challenges like that or

01:21 problems like that but your company that

01:24 hired you as a data scientist what they

01:27 do is they send out email offers to

01:29 customers with like a proposal to buy

01:32 certain products it might be a clothing

01:36 store it might be a grocery store or

01:38 something like that so what they do is

01:40 basically they send out an offer in an

01:43 email to a lot of customers to

01:47 purchase certain products and here

01:49 you've got a sample of those customers

01:51 that they contacted recently you've got

01:52 their age and also you have a variable

01:56 whether or not they took action so did

01:59 the person take up an action perform an

02:01 action did they take up an offer do they

02:03 buy a product did they open up an email

02:05 respond to our email and so on so was an

02:07 action taken or not and it's very

02:09 black-and-white very different but at

02:12 the same time like

02:13 no we don't know what to do we don't

02:15 know what's going on here it's it's not

02:16 what we're expecting but at the same

02:17 time intuitively we can see that there

02:20 is some sort of correlation we can see

02:22 that the observations on the bottom

02:25 there are a bit more to the left

02:26 observations on the top are a bit more to

02:28 the right implying kind of that probably

02:31 older people are more likely to take

02:34 action based on this offer and younger

02:37 people are more likely to ignore it so

02:40 can we somehow model this how about we

02:43 try the existing method in our

02:46 toolkit which is the linear regression

02:49 let's run a linear regression and that's

02:51 what it looks like as you can tell it

02:53 doesn't look like the best approach

02:55 doesn't look like the best method to

02:57 solve this problem
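As a side note, the straight-line fit being tried here can be sketched in a few lines of scikit-learn; the salary/experience numbers below are made up purely for illustration, not taken from the video:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up salary vs. experience data, echoing the earlier example.
X = np.array([[1.0], [3.0], [5.0], [7.0]])           # years of experience
y = np.array([40000.0, 55000.0, 70000.0, 85000.0])   # salary

model = LinearRegression().fit(X, y)   # puts a line through the data
forecast = model.predict([[4.0]])      # forecast salary at 4 years' experience
```

Because the toy data is perfectly linear, the fitted line recovers it exactly; on the 0/1 "action" data above, the same call produces the awkward line the video is criticizing.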

02:59 so let's look into this in a bit more

03:02 detail we're going to draw another

03:08 horizontal line over here instead of

03:11 trying to predict exactly what's going

03:14 to happen for any given person let's

03:16 imagine a person and let's say we want

03:18 to predict for that person knowing their

03:20 age we want to predict whether they will

03:23 take up the offer or not but instead of

03:26 predicting exactly whether they're going

03:28 to take it up or not how about instead

03:30 we will predict the probability we will

03:35 state a probability or a likelihood of

03:37 that person taking up that offer and if

03:40 you think of it in that way right away

03:42 things start becoming clearer right away

03:45 you can see that okay so this chart is

03:48 actually from 0 to 1 and I also know

03:50 that probabilities are from 0 to 1 oh

03:52 that's interesting so basically I could

03:55 fit in probabilities between 0 & 1 the

03:57 fact that the red dots and the red

03:59 observations are already either 0 or 1 and

04:01 nowhere in between well that's simply

04:03 because we already know the result we

04:05 already know that they're either there

04:06 or there but for something we are

04:08 predicting it kind of makes sense to say

04:11 well how I don't know for sure I don't

04:14 know 100% he'll take it up or not but I

04:16 know maybe maybe over an 80% chance

04:18 he'll take it up or not right and when

04:21 you think of it that way the linear

04:22 regression line or at least that part

04:24 that's in the middle

04:26 between zero and one it makes sense

04:27 right well it may make some sense

04:29 because that is basically it's telling

04:33 you that anybody between those ages of

04:35 for instance where it's crossing the

04:37 horizontal line for the first time it

04:39 might be where it's crossing the

04:41 horizontal axis it might be like 25 or

04:43 let's say 35 and we're crossing the

04:46 horizontal line at one it

04:49 might be let's say 55 so those people

04:52 between 35 and 55 they anything in

04:57 between any person that falls in between

04:59 that age there is a probability of them

05:02 taking up this offer and that

05:02 probability is increasing as we move

05:07 to the right as we take more and more

05:09 older people that probability is

05:10 increasing so the part of the linear

05:12 regression in the middle kind of makes

05:14 sense and we can do something with it

05:16 but the parts that don't make sense at

05:18 all are the ones at the top at the

05:20 bottom because a probability can never

05:22 be less than zero it can never be above

05:25 one so what is the linear regression

05:26 trying to give us a hint about here well

05:29 what it's probably saying what we could

05:31 interpret it as is that people above

05:35 that age that nominal age we said 55

05:38 above that age they they are very very

05:42 likely to take it up or actually

05:43 more than a hundred percent so basically

05:46 they're definitely taking it up anybody

05:48 below 35 on the other side on the Left

05:51 they're definitely not taking it so

05:54 essentially what we're saying is if we

05:56 ever take that approach then we would

05:57 have to replace this linear regression

05:59 line with a line that looks like that so

06:01 let's just cut those bits off and

06:03 replace them with horizontal parts and

06:05 that would be a very basic but it still

06:11 would be an attempt at creating a model

06:13 for this situation so we would still be

06:16 able to use this to make some sort of

06:17 predictions and assumptions that

06:21 talk about the correlation between

06:23 the action and the age of a person so

06:25 that's a very basic understanding and

06:28 that's kind of the start of our

06:31 understanding of intuition behind

06:33 logistic regression so let's see what

06:35 the actual scientific approach is so

06:39 here we've got the

06:40 line that we looked at and it is

06:43 described by this equation now this part

06:46 is gonna be this is the most fun part so

06:48 bear with me if you apply to this

06:50 equation a sigmoid function which looks

06:53 like that
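For reference, the sigmoid function mentioned here is 1 / (1 + e^(-z)); a quick sketch shows how it squashes any linear expression b0 + b1·x into the (0, 1) range. The coefficients below are hypothetical, chosen only to show the shape:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real z into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical intercept and slope (not from the video):
b0, b1 = -10.0, 0.25
ages = np.array([20.0, 30.0, 40.0, 50.0])
p_hat = sigmoid(b0 + b1 * ages)   # probabilities rise smoothly with age
```

Applying the sigmoid to the linear-regression expression is exactly the algebra step the slides perform: solving the purple box for y and substituting back gives the green-box logistic formula.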

06:54 so you put the Y into the sigmoid

06:57 function in purple and then you solve

07:01 for y from the purple box and you put Y

07:04 back into the blue box then you will get

07:07 the green box so basically your linear

07:11 regression will start to look like this

07:12 and this is the formula for logistic

07:15 regression and what that will do to your

07:17 chart which is most importantly this

07:20 visual part it will convert it from the

07:24 chart that we see at the top to this new

07:26 chart which is actually the logistic

07:30 regression function so if at this stage

07:34 you're asking yourself what just

07:37 happened then you're not alone the first

07:41 time I saw this or I I learned this this

07:44 was the expression on my face if you

07:47 can already tell where all of

07:49 this comes from that's super great that

07:50 means you'll fly through this section

07:52 but if you're confused right now

07:54 not a problem I was the same when I was

07:56 in your shoes so let's take this step by

08:00 step let's look at it step by step

08:01 exactly what happened so there's our

08:04 graph there's our independent variable

08:08 there's our outcome yes or no so that's

08:12 the Y the dependent variable and there

08:15 are our observations in our data set

08:18 based on these observations and plus

08:20 using this formula which we're going to

08:24 take as given this is the logistic

08:26 regression formula using this formula

08:29 and these observations we come up with

08:32 this line and what is important to

08:35 understand here it's not a magical line

08:37 this line for the logistic regression is

08:40 the same as a slope or a trend line for

08:44 a linear regression so basically what

08:46 this line is doing is it is using the

08:50 formula it's following the formula and

08:52 it's the best

08:53 fitting line that can fit these datasets

08:56 so basically we're doing exactly the

08:58 same thing as with a linear regression

09:00 but it just looks different that's also

09:03 there are heaps of these lines

09:05 that you can draw that look like

09:06 that but only one of them is the best

09:09 fitting line so the point of the

09:11 logistic regression is to find that best

09:13 fitting line and this is it so we found

09:16 the best fitting line that follows that

09:19 equation and fits these variables that

09:23 we or these observations that we had in

09:25 our data set after that we can forget

09:28 about equation we forget about the

09:30 variables we've got our line so this is

09:32 our logistic regression function we

09:35 found it same thing as with the linear

09:37 regression we've created the model we've

09:38 built the model you can see it this is

09:40 the model in front of you right there

09:42 now what can we do with this logistic

09:45 regression well we can use it to predict

09:48 probabilities and we've already touched

09:51 on probabilities that they lie between

09:53 zero and one and that instead of

09:55 predicting for sure that something will

09:57 and will not happen how about we predict

09:58 probability so let's look at oh by

10:02 the way the probability here is called P hat

10:05 so that's a little sign above the P

10:08 it gives it the name P hat and anything

10:11 you see in the hats in this section just

10:14 basically means that it's something

10:16 we're predicting so that's a

10:19 way to remember it that picture P

10:21 hat so we're predicting this probability

10:24 okay so let's take four random values

10:28 for the independent variable for X we're

10:31 going to say 20 30 40 50 let's see what

10:33 happens to these variables so let's put

10:36 them on the X line those are the dots

10:38 and I specifically put dots not X's or

10:41 crosses because it doesn't mean that

10:44 they're on the horizontal the bottom

10:45 line doesn't mean that their probability

10:47 is zero or that they're dependent

10:51 variable is zero no they're just there

10:53 because they're on the x-axis we just

10:55 plotted them there it has nothing to do

10:56 with vertical axis now

10:59 now what you need to do to find the

11:02 probabilities is you need to project

11:03 these values onto your curve

11:05 once you project them you get these

11:09 blue dots or blue observations

11:13 which are plotted basically so these are

11:15 the fitted values as you remember

11:17 in red you have the actual and

11:19 in blue you have the fitted values so

11:21 these are your fitted values and now if

11:24 you project them if you want the

11:26 probabilities you need to project them

11:28 to the left like that and let's have a

11:30 look at these probabilities so the

11:32 person who's twenty years old the

11:34 probability of taking up this offer is

11:35 very low perhaps 0.7% so less than a

11:39 1 percent chance to take up this offer the

11:41 person who's 30 years old the

11:44 probability is higher it's about 23%

11:46 to take up this offer the person who is

11:48 40 years old their probability to take

11:51 up this offer is 85% according to this

11:53 model and the person who's 50 years old

11:55 their probability is 99.4% so that's the

11:59 first thing that you can get out of a

12:02 logistic regression that's what we're

12:03 going to be using we're gonna be

12:06 using it very actively when we're

12:08 talking about building geo-demographic

12:11 segmentations because you use this

12:12 probability as a score and I will talk

12:15 about this more so you can actually rank

12:17 people who is the most likely to take up

12:20 your offer who's the least likely to

12:21 take up your offer so it's actually even

12:24 better than just having a 1 or a 0 you

12:26 have a probability so you can order

12:29 people by this probability
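That ranking idea can be sketched directly with the four predicted probabilities from the example (0.7%, 23%, 85%, 99.4%):

```python
import numpy as np

# Ages and predicted probabilities from the four-person example above.
ages  = np.array([20, 30, 40, 50])
p_hat = np.array([0.007, 0.23, 0.85, 0.994])

# Sort indices by probability, highest first, to rank the customers.
order = np.argsort(p_hat)[::-1]
ranked = ages[order]   # most likely to take up the offer comes first
```

This is why the probability output is richer than a plain 0/1 answer: the ordering itself is usable, for instance to decide whom to contact first.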

12:30 anyway you might want to say well I

12:34 don't want the probability I want a

12:35 prediction because this is a

12:37 regression

12:39 I want a prediction for the the Y value

12:43 so ok we can do that let's

12:47 get rid of those probabilities now can

12:50 we get the Y the actual well obviously

12:52 we can't get the actual because the

12:54 actual is something that we can only

12:56 observe in you know a data set or in real

12:58 life we can only get a prediction for

13:00 the actual so y hat as the name suggests is

13:04 the predicted value for the dependent

13:07 variable how did you get Y hat well the

13:11 approach is very arbitrary you have to

13:13 select a line let's wait for that ok

13:16 so you have to select a line in this

13:17 case we're going to take 50

13:18 percent you can select it anywhere but

13:21 50% is usually selected because it's in

13:23 the middle and it's therefore you have

13:25 symmetry and anything below this line so

13:28 anything that falls on the curve below

13:30 this line will be projected downwards

13:32 onto the zero line which makes

13:35 sense so it's basically saying if your

13:37 probability your predicted probability

13:40 of taking up this offer is less than 50%

13:42 let's say it's 40% or 20% then we're

13:44 just gonna say that you're

13:46 probably not gonna take up this offer

13:47 and so that's what's happening and the

13:50 person with 0.7% the person with

13:52 whatever it was 23% their

13:58 predicted probabilities are not zero but

14:00 they're below 50 so if

14:04 you do require a y hat so a

14:06 predicted value a yes/no value then it makes

14:11 sense that if something is below 50%

14:12 you're probably gonna say that they're

14:14 not gonna take up the offer so

14:16 there you go both of their Y

14:20 hats are zero now anything above the

14:22 horizontal line that we've selected the

14:24 50% line it is agreed that all of those

14:29 values that fall onto the curve above

14:32 that line are projected upwards they're

14:34 projected onto the yes line the one

14:37 one line so the person that had a

14:40 probability of 85% is projected outwards

14:43 and the person that had the probability

14:45 of 99.7% is projected upwards also makes

14:47 sense right so if

14:51 you're predicting that somebody's

14:52 probability of taking up an offer is 85%

14:54 then if you have to say yes or no then

14:57 you're probably gonna say yes you're

14:58 gonna say yes this person will take up

15:00 the offer if you just if you have to

15:02 choose one of the two so those are our

15:06 predicted Y hat values in this case

15:10 they're both one for those two variables
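The 50% cutoff just described reduces to one line of NumPy; the probabilities are the ones from the running example:

```python
import numpy as np

# Predicted probabilities for the four customers (0.7%, 23%, 85%, 99.4%).
p_hat = np.array([0.007, 0.23, 0.85, 0.994])

# Project everything below the 0.5 line down to 0 and everything above up to 1.
y_hat = (p_hat >= 0.5).astype(int)
```

Moving the cutoff away from 0.5 (say to 0.7 if a false "yes" is expensive) changes which customers land on which side, which is exactly the sensitivity the video warns about.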

15:12 and those are the two things you can get

15:16 out of the logistic regression so you

15:18 get the probabilities which are

15:21 important also you can get the Y hat so

15:23 the predicted values for the dependent

15:24 variables once again it's important to

15:26 think of it as it's doing exactly the

15:29 same thing as a linear regression it's

15:33 fitting this line even though it's

15:35 not a straight line and the

15:38 values are not scattered everything

15:41 looks bizarre in its uniformity or in

15:45 the way its structure

15:47 makes it look very bizarre but still

15:50 it's pretty much the same way we've

15:53 agreed on a line or a formula for a

15:56 curve and we're trying to fit the best

15:58 curve to our data once we've done that

16:00 we've got a model we've got

16:03 coefficients which we'll talk about

16:04 later and we can start drawing

16:07 conclusions or insights from this model

16:10 and some of the insights are we can get

16:13 a probability of somebody taking action

16:16 or of the event occurring and or

16:19 basically of the answer being yes so

16:21 it's not a yes/no it's a probability so

16:23 85% or 20% or whatever so that's when we

16:26 project it to the left onto the y-axis

16:28 and also we can get a predicted value

16:31 for the dependent variable based on

16:33 where we select this arbitrary line 50%

16:36 you can select it anywhere you like you

16:38 can select higher lower depends on your

16:41 knowledge about the problem at hand and

16:44 as you can understand depending on where

16:46 you select it that will significantly

16:47 affect your predictions so logistic regression

16:51 manages to separate some categories and

16:53 predict a binary outcome so let's start

16:56 making the model right now and to start

16:59 we need to set the working directory so

17:02 we have to go to file explorer here and

17:04 then go to our Machine Learning A-Z

17:07 folder and part three classification and

17:09 then logistic regression all right

17:12 that's the right folder make sure that

17:14 you have your data set

17:16 Social_Network_Ads.csv and then don't forget to

17:18 click on this little button here to set

17:20 the folder as working directory all

17:23 right and now the real first step of

17:25 making a machine learning model is to

17:27 pre-process the data and to do this of

17:29 course we're going to use the data

17:31 pre-processing template we've been

17:32 preparing in part one so we're going to

17:34 copy all this copy and paste it here so

17:39 now we just need to change a few things

17:41 first let's import the three essential

17:43 libraries

17:45 here we go and then here we need to

17:47 change the name of the dataset here it's

17:49 of course Social underscore Network

17:54 underscore Ads and now let's select this

17:57 line to have a look at our data set

18:00 so execute perfect now let's go to

18:03 variable Explorer to have a look at our

18:05 dataset by double-clicking on it here

18:07 and here is the data set okay so just a

18:10 quick reminder this dataset contains

18:12 informations of users in a social

18:14 network

18:15 so those informations are the user ID

18:17 the gender the age and the estimated

18:20 salary and this social network has

18:22 several business clients which can put

18:25 their ads on the social network and one

18:27 of their clients is a car company who

18:29 has just launched their brand new luxury

18:32 SUV for a ridiculous price and we're

18:36 trying to see which of these users of

18:38 the social network are going to buy this

18:40 brand new SUV and so the last column

18:43 here tells if yes or no the user but

18:46 this SUV we are going to build a model

18:50 that is going to predict if a user is

18:52 going to buy or not the SUV based on two

18:55 variables which are going to be the age

18:57 and the estimated salary so our matrix

19:00 of features is only going to be these two

19:02 columns we want to find some

19:04 correlations between the age and the

19:06 estimated salary of a user and his

19:08 decision to purchase yes or no the SUV

19:11 okay so what are the indexes this is the

19:15 index 0 index 1 index 2 and index 3 so here

19:20 we're gonna put our indexes we're gonna

19:23 put our two indexes in brackets so it's

19:25 2 comma and 3 so that's the two indexes

19:29 of the columns that we want to include

19:33 in our matrix of features ok and what

19:36 about Y let's see Y okay so this was

19:39 index 2 index 3 and index 4 so here Y

19:42 is gonna take the index 4 alright and

19:46 now we're ready let's check if

19:48 everything's okay let's select this and

19:51 press ctrl + Enter to execute and here we

19:54 go

19:54 x and y appear here so let's look at

19:57 them

19:58 X okay this is clearly the age here and

20:01 the estimated salary for all of our 400

20:05 observations perfect okay and y

20:09 okay perfect y is the purchased column

20:12 zero means no the user didn't buy the

20:14 car and one means yes
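The matrix of features and the dependent-variable vector extracted above can be sketched like this; a few made-up rows stand in for Social_Network_Ads.csv, but the column order (User ID, Gender, Age, EstimatedSalary, Purchased) follows the video:

```python
import pandas as pd

# Stand-in rows; in the video this is: dataset = pd.read_csv('Social_Network_Ads.csv')
dataset = pd.DataFrame({
    'User ID':         [15624510, 15810944, 15668575, 15603246],
    'Gender':          ['Male', 'Male', 'Female', 'Female'],
    'Age':             [19, 35, 26, 27],
    'EstimatedSalary': [19000, 20000, 43000, 57000],
    'Purchased':       [0, 0, 0, 1],
})

X = dataset.iloc[:, [2, 3]].values   # indexes 2 and 3: Age, EstimatedSalary
y = dataset.iloc[:, 4].values        # index 4: Purchased (0 = no, 1 = yes)
```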

20:16 all right now let's split the data set

20:20 into the training set and the test set

20:22 so which test size would you like to

20:24 choose we have four hundred observations

20:27 so a good test size would be to

20:29 have 300 observations in the training

20:31 set and a hundred observations in the

20:33 test set is that cool okay so let's do

20:36 this that means that we want to take

20:38 0.25 that is a hundred observations

20:41 let's select this and press ctrl + Enter

20:46 to execute all right so we have now our

20:49 X train X test y train and y test

20:52 and you can see that we have 300

20:54 observations in the training set here and

20:56 a hundred observations in the test set
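The 300/100 split can be sketched as follows. Note that in current scikit-learn `train_test_split` lives in `sklearn.model_selection`; older versions of this course's code imported it from `sklearn.cross_validation`, which was removed in scikit-learn 0.20. The placeholder arrays stand in for the real 400-row data set:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 400 placeholder observations, matching the video's data-set size.
X = np.random.rand(400, 2)
y = np.random.randint(0, 2, size=400)

# test_size=0.25 keeps 300 rows for training and 100 for testing;
# random_state=0 makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
```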

20:59 okay great and now do we need to apply

21:02 feature scaling well yes we're gonna do

21:04 it because we want accurate predictions

21:06 we want to predict which users are going

21:09 to buy the SUV to you know target these

21:12 users as well as possible so I'm gonna

21:15 remove that multi-line comment here and

21:21 I think we're fine X train and X

21:24 test are ready to be transformed to be

21:27 scaled so let's select this command ctrl

21:31 + Enter to execute okay perfect now

21:35 let's look at our training set and test

21:37 set so as you can see the X train is

21:40 well scaled okay so y

21:44 Train wasn't scaled obviously because

21:45 this is the categorical dependent

21:47 variable and X test is scaled on the

21:50 same scale because you know we fitted

21:52 the sc_X StandardScaler object to our

21:56 matrix of features X train and we used

21:58 this same object on X test so this

22:01 means that it scaled on the same basis
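The fit-on-train, transform-on-test discipline described here looks like this; the small arrays are made-up stand-ins for the real age/salary features:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in (age, salary) rows; the real X_train/X_test come from the split above.
X_train = np.array([[19., 19000.], [35., 20000.], [26., 43000.], [27., 57000.]])
X_test  = np.array([[46., 28000.]])

sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)  # fit the scaler on the training set only
X_test  = sc_X.transform(X_test)       # reuse the same scaling on the test set
# y is the categorical dependent variable, so it is left unscaled.
```

Fitting only on the training set is what guarantees both sets are scaled "on the same basis", as the video puts it.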

22:03 okay and y test of course is not

22:06 scaled great so we're ready we

22:09 pre-processed our data correctly and now

22:11 we're ready to build

22:12 our logistic regression model I can't

22:14 wait to show you the results your

22:16 intuition is going to get even more

22:18 shaped by looking at the results because

22:21 everything is going to be trained on the training set

22:22 and that's what we're going to do in

22:24 this tutorial so as usual let's import

22:27 the correct library for that job which

22:29 is going to be the linear model library

22:31 and why linear it's because the logistic

22:34 regression is a linear classifier

22:36 which means that here since we are in

22:38 two dimensions our two categories of

22:40 users are going to be separated by a

22:42 straight line so you're gonna see your

22:44 intuition of logistic regression is even

22:46 going to be better shaped when you will

22:48 find out about the graphic results so

22:50 let's import this library by typing from

22:53 sklearn dot linear model that's the

22:58 name of the library here it is okay and

23:02 then import we're gonna import the right

23:05 class for it which is the logistic

23:09 regression class once again a very

23:11 intuitive name we like it and then as

23:14 usual we are going to create an object

23:16 from this class which is going to be our

23:18 classifier that we are going to fit on

23:21 our training set so let's do it I'm

23:23 going to create a new variable

23:25 classifier which is actually our

23:28 logistic regression object and to create

23:32 this object I'm going to call the class

23:33 logistic regression and here we go the

23:42 logistic regression class has several

23:43 parameters we can check them by

23:45 inspecting the logistic regression class

23:47 we type command I to inspect the class

23:50 here we go as you can see there are

23:52 several parameters but we're not going

23:53 to use any of them we're only going to

23:56 use the random state parameter here just

23:58 to have the same result so here I'm

24:00 gonna input random state equals 0 so

24:05 that we all get the same results all

24:07 right so our logistic regression object

24:10 is ready and now I'm gonna take it

24:13 classifier and I'm going to fit it to

24:17 the training set and to do this I'm

24:19 gonna use this method here which is the

24:22 fit method to fit it to our training set

24:25 okay so enter fit and then we want to

24:28 fit it to our training set so here we

24:30 input X train and y train because

24:39 remember what this means is that we're

24:40 taking our logistic regression

24:42 classifier object we are fitting it to

24:45 the training set X train and Y train so

24:49 that our classifier learns the

24:50 correlations between X train and y train

24:53 then by learning those correlations you

24:55 will be able to use what it learned

24:57 to predict new observations and we're

25:00 going to test its predictive power on a

25:02 different set which is going to be the

25:04 test set and that's what we're going to

25:06 do in the next tutorial so I'm just

25:08 going to select here this command ctrl +

25:13 Enter to execute and here we go now our

25:16 logistic regression model is fitted to

25:18 the training set okay so that's perfect

25:21 our logistic regression model is fitted

25:23 to the training set and now in the next

25:25 tutorial we will be predicting the test

25:28 set results with this logistic

25:30 regression model
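The classifier construction and fitting steps just walked through look roughly like this; a tiny synthetic, already-scaled training set stands in for the real one:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the scaled (age, salary) training data.
X_train = np.array([[-1.5, -1.0], [-1.0, -0.5], [0.5, 1.0], [1.5, 0.5]])
y_train = np.array([0, 0, 1, 1])

# random_state=0 so everyone gets the same results, as in the video.
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)   # learns the X_train / y_train correlations
```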

25:31 so the test set results so it's going to be

25:34 very fast going to take us one line so

25:36 let's write it we're going to introduce

25:38 a new variable here that we're gonna

25:40 call Y underscore pred because Y pred

25:44 is going to be the vector of predictions

25:46 that means that it's going to be the

25:48 vector that gives the prediction of each

25:51 of the test set observations

25:53 okay so let's compute this vector simply

25:56 to compute it we're gonna take our

25:58 classifier that we created in the

26:01 previous tutorial the classifier that

26:03 was fitted to the training set and here

26:06 we're going to use another method of the

26:09 logistic regression class because the

26:11 classifier is an object of the logistic

26:13 regression class and this method is

26:15 going to be this time the predict method

26:18 very simply here it is predict and in

26:23 parentheses we have to input an argument

26:25 in your opinion what is this argument

26:28 going to be we are going to predict the

26:30 results of the X test observations so

26:33 here we have to input X test

26:36 okay and it's ready so let's select this

26:39 line press command ctrl + Enter to

26:42 execute and here we go our vector of

26:45 predictions y pred is ready let's

26:47 have a look

26:48 so y pred is just here and if we open

26:52 our test set you're gonna understand

26:54 what it does okay so remember this was

26:57 scaled so we cannot interpret this but

27:00 here was the age column and here was the

27:02 salary column okay so let's take for

27:05 example the first user here we have a

27:07 zero which means that our logistic

27:10 regression classifier predicted that

27:12 this user didn't buy the SUV car however

27:14 if we look at the seventh observation

27:17 here so the seventh user here we have a

27:20 one which means that the logistic

27:22 regression classifier predicted that

27:24 this user bought the SUV car
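The prediction step is the single `predict` call described above; synthetic one-feature data stands in for the scaled training and test sets here:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in training data (already on a standardized scale).
X_train = np.array([[-1.0], [-0.5], [0.5], [1.0]])
y_train = np.array([0, 0, 1, 1])
classifier = LogisticRegression(random_state=0).fit(X_train, y_train)

# Two stand-in test observations.
X_test = np.array([[-0.8], [0.9]])
y_pred = classifier.predict(X_test)   # one 0/1 prediction per test-set row
```

Each entry of `y_pred` is the model's yes/no call for the corresponding test-set user, exactly like the 0 and 1 read off the variable explorer in the video.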

27:28 okay so let's close this and this and

27:30 that's it for this tutorial that's how

27:32 you predict the test set results with

27:34 our logistic regression classifier in

27:35 the next tutorial we will be evaluating

27:38 the performance of this logistic

27:40 regression classifier using the

27:42 confusion matrix so we're going to see

27:44 the number of correct predictions and

27:46 the number of incorrect predictions that

27:48 our logistic regression classifier made

27:51 and we predicted the test set results

27:53 and today we're going to evaluate if our

27:56 logistic regression model learned and

27:59 understood correctly the correlations in

28:01 a training set to see if it can make

28:03 powerful predictions on a new set

28:04 especially the test set so this

28:07 confusion matrix is going to contain the

28:09 correct predictions that our model made

28:11 on the test set as well as the incorrect

28:13 predictions all right so let's create

28:16 this confusion matrix once again it's

28:18 going to be very fast and as usual we

28:21 are going to import a tool that is going

28:23 to help us compute this confusion matrix

28:26 faster only this time it's not going to

28:28 be a class it's going to be a function

28:30 which we will import from the sklearn

28:33 dot metrics library so let's do it let's

28:36 type from sklearn dot metrics import and

28:45 then we import our function so our

28:48 function is confusion matrix confusion

28:53 underscore matrix all right and you can

28:56 clearly make the distinction between a

28:58 class and a function because remember

29:00 the class contains capitals at the

29:03 beginning of the words so now we're

29:05 going to use this function to compute

29:06 the confusion matrix in just one line so

29:09 we're going to call our confusion matrix

29:10 cm which is a new variable but it's the

29:14 confusion matrix and then we're gonna

29:17 use our function confusion matrix that

29:20 we just imported there we go and then

29:25 we're gonna input the parameters all

29:26 right so let's inspect the confusion

29:28 matrix we press command I to have a look

29:32 at the info here we go so that's our

29:34 parameters let's see what parameters we

29:36 have to input so the first parameter is

29:38 Y true

29:40 so that's

29:41 the correct values that's the real

29:42 values that's the values of your data

29:45 set that's actually what happened in

29:47 real life and a cool name to designate

29:49 it I like this name it's ground truth

29:52 ground truth means the real values that

29:55 happened in reality okay so our first

29:57 parameter is then y_test because this

30:01 is the vector of the real values which

30:04 are telling if yes or no the user really

30:06 bought the SUV car and then what is the

30:10 second parameter the second parameter is

30:12 y_pred so this time it's the vector

30:15 of predictions the vector of predictions

30:17 that our logistic regression model

30:19 predicted so do you guess what this is

30:22 it's actually Y underscore pred and

30:26 that's it that's all we need to know I

30:28 know there are several more

30:30 parameters or actually there is just one

30:32 parameter it's labels well it's fine

30:34 we're not going to use it we have

30:35 everything we need here so let's select

30:37 these two lines and execute all right
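The two lines just described can be sketched as follows. The toy `y_test` and `y_pred` vectors are made up for illustration; in the tutorial they come from the earlier train/test split and prediction steps.

```python
from sklearn.metrics import confusion_matrix

# Toy ground-truth (y_test) and prediction (y_pred) vectors --
# stand-ins for the vectors built in the previous steps.
y_test = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1, 0, 1]

# One line: ground truth first, predictions second.
cm = confusion_matrix(y_test, y_pred)
print(cm)
# [[3 1]
#  [1 3]]  -> diagonal = correct predictions, off-diagonal = incorrect
```

Typing `cm` in the console, as done next, prints this same array.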

30:42 the confusion matrix is created now

30:44 let's have a look to have a look at it

30:46 we can go to the console here and type

30:49 cm and then press ENTER to have a look

30:52 at it okay so here's the matrix we will

30:55 analyze in much further detail the

30:57 confusion matrix in the final section of

31:00 this part about evaluating the model

31:03 performance but so far I'm just going to

31:05 tell you that the two numbers here 65

31:07 and 24 are the correct predictions and

31:10 those two numbers here 8 and 3 are the

31:13 incorrect predictions so we can see that

31:15 we have quite a lot of correct

31:16 predictions that's good we have 65 plus

31:18 24 equals 89 correct predictions and we

31:23 have 8 plus 3 equals 11 incorrect

31:26 predictions so that's good that's a

31:29 first step into evaluating a model

31:31 performance but really the most

31:33 interesting way to do that to evaluate a

31:35 model performance is what we're going to

31:37 do in the next tutorial because we're

31:39 gonna have a graphic visualization of

31:42 our results and we will clearly see how

31:44 our classifier separates the two

31:46 categories that means that we will see

31:48 the decision boundary of the classifier

31:51 and therefore the decision regions that

31:54 means that we're going to

31:55 clearly see the region where the

31:56 classifier predicts that the user is

31:58 going to buy the product and the other

32:00 region where the classifier predicts

32:02 that the user is not going to buy the

32:03 product you'll see it's going to be

32:05 very fun it's going to be exciting

32:07 visual graphic results I can't wait to

32:09 show you these results we used the

32:12 confusion matrix to evaluate the

32:14 predictive power of our logistic

32:16 regression model that was actually quite

32:18 fine

32:19 we found 65 plus 24 equals 89 correct

32:22 predictions and 8 plus 3 equals 11

32:25 incorrect predictions but that's not

32:28 really the fun way of visualizing the

32:30 predictive power what is much more fun

32:33 and much more exciting is to look at the

32:35 results on a graph so that's what we're

32:37 going to do in this tutorial we are

32:38 going to make a graph where we will

32:40 clearly see the regions where our

32:42 logistic regression model predicts yes

32:44 the user is going to buy the product and

32:46 know the user is not going to buy the

32:48 product so let's make this graph so

32:52 unfortunately we cannot make this graph

32:54 in just one line or two lines I tried to

32:57 make it as simple as possible but by

33:00 doing that I ended up with a little more

33:02 than 10 lines of code so what I'm gonna

33:05 do I'm not going to write the whole code

33:07 right now because you might fall asleep

33:09 so what I'm gonna do is I'm going to

33:11 take the code I prepared to visualize

33:14 the training set results in my text

33:16 editor I'm going to copy it paste it

33:19 here then I will select this code and

33:21 execute it we will look at our results

33:23 interpret them and then for those of you

33:26 interested in how we made such a graph

33:28 at the end of the tutorial I will

33:29 explain how it makes the graph okay

33:32 so I copied it I'm gonna paste it now

33:35 here it is here is the code so that's 15

33:39 lines of code as you can see and so

33:41 since I can't wait to show you the

33:43 results we're gonna select the code now

33:45 and I'm gonna press Command or Ctrl + Enter

33:49 to execute and we will see the whole

33:51 thing behind logistic regression model

33:54 we are gonna see the true results and

33:56 regions of predictions so let's do it

33:59 Command or Ctrl + Enter to execute and here

34:03 we go that's the graph so I'm going to

34:05 enlarge this

34:08 and now I'll explain what to interpret

34:10 okay so let's analyze this graph

34:13 step-by-step first let's focus on all

34:17 the points here we can see that we have

34:18 some red points and some green points so

34:22 all these points that we see on this

34:24 graph are the observation points of our

34:26 training set that is these are all the

34:29 users of the social network that were

34:32 selected to go to the training set and

34:34 each of these users here is

34:37 characterized by its age here on the

34:39 x-axis and it's estimated salary here on

34:43 the y-axis now we can see that there are

34:46 some red points here and some green

34:50 points here the red points are the

34:52 training set observations for which the

34:55 dependent variable purchased is equal to

34:57 zero and the green points are the

35:00 training set observations for which the

35:03 dependent variable purchased is equal to

35:05 one that means that the red points here

35:08 are the users who didn't buy the SUV and

35:11 the green points here are the users who

35:14 bought who actually bought the SUV so

35:16 now as a first step of analysis let's

35:19 give an interpretation of what we

35:22 observe here with these users okay so

35:24 first we can see that the users who are

35:27 young with a low estimated salary so

35:30 these users here actually didn't buy the

35:33 SUV because these points are the real

35:35 observation points they're red and red

35:38 corresponds to zero here so that means

35:40 that all these points here are the users

35:42 who didn't buy the SUV the users in the

35:44 training set then if we look at the

35:47 users who are older and with a higher

35:50 estimated salary well we can see that

35:52 most of these users actually bought the

35:54 SUV and it actually makes sense because

35:56 the SUV is more like a family car and

35:59 therefore more interesting for these

36:01 older users here with a high estimated

36:03 salary besides we can also see that some

36:05 older people even with a low estimated

36:08 salary actually bought the SUV because

36:11 we can see that we have some green

36:12 points here that correspond to an age

36:15 above the average the average is here

36:18 but an estimated salary below the

36:20 average because the average is here okay

36:23 so these guys these older guys although

36:26 they have a low estimated salary

36:27 actually bought the SUV probably because

36:29 they've been saving up some money or

36:31 maybe they finished paying up their

36:32 mortgage

36:33 I don't know but what's for sure is that

36:34 they couldn't resist buying this very

36:37 cool luxury SUV offered at a

36:40 ridiculously low price and on the other

36:43 hand we can also see that there are some

36:45 young people here with a high estimated

36:48 salary who actually bought the SUV

36:51 you know maybe because it's a very cool

36:53 SUV and they want to impress their

36:55 friends and take them into road trips or

36:57 maybe they already have a family I don't

36:59 know anyway

37:00 they bought the SUV actually there are a

37:03 lot of buyers so this must be a very

37:04 cool and cheap SUV okay

37:07 and now what is the goal of

37:09 classification now we're talking machine

37:12 learning why are we making some

37:14 classifiers and what will classifiers

37:17 do at least what are we trying to

37:19 make them do for this particular

37:20 business problem well the goal here is

37:23 to classify the right users into the

37:26 right categories that is we are trying

37:29 to make a classifier that will catch the

37:32 right users into the right category

37:34 which are yes they buy the SUV and no

37:37 they don't buy the SUV and we

37:39 represented the way our classifier

37:41 catches these users by plotting what I

37:43 called the prediction regions and so the

37:46 prediction regions are the two regions

37:48 that we see on this graph this red one

37:51 here and this green one here and the red

37:54 prediction region is the region where

37:56 our classifier catches all the users

37:58 that don't buy the SUV and the green

38:01 prediction region is the region where a

38:03 classifier catches all the users that

38:05 buy the SUV but be careful this is

38:08 according to the classifier that is for

38:11 each user of this red prediction region

38:13 here our logistic regression classifier

38:16 predicts that the user doesn't buy the

38:18 SUV and for each user of this green

38:21 prediction region here our classifier

38:24 will predict that the user buys the SUV

38:26 even if that's not the case in real life

38:29 that's just a prediction but that's what

38:31 our classifier

38:32 believes will happen it is the

38:34 classifier prediction compared to the

38:38 truth here which is the point the point

38:40 is the truth and the region here is the

38:43 prediction so that makes an awesome tool

38:46 because for each new user of the social

38:49 network well our classifier a logistic

38:52 regression classifier will tell based on

38:54 its age and its estimated salary if this

38:57 user belongs to this red prediction

38:59 region here and therefore doesn't buy

39:01 the SUV or if this user belongs to this

39:04 green prediction region here and

39:05 therefore buys the SUV and that way this

39:09 business client car company can

39:10 substantially optimize their marketing

39:13 campaign by targeting the social network

39:16 ads to the users in the green region

39:17 because these are the users that are

39:20 predicted to buy the SUV according to

39:23 our classifier now the other very

39:26 important thing to understand is that

39:27 these are two prediction regions

39:30 separated by a straight line which is

39:33 the straight line here and the straight

39:35 line is called the prediction boundary

39:37 because it's the boundary between the

39:39 two prediction regions and the fact that

39:42 it's a straight line is not random it is

39:45 for a particular reason and that's the

39:47 thing very important to understand

39:49 because that's the essence of logistic

39:52 regression

39:52 if the prediction boundary is a straight

39:55 line here that's because our logistic

39:57 regression classifier is a linear

39:59 classifier that means that here since we

40:02 are in two dimensions you know because

40:04 we have two independent variable the age

40:06 and the estimated salary so we are in

40:07 two dimensions then since the logistic

40:10 regression classifier is a linear

40:11 classifier then the prediction boundary

40:14 separator here can only be a straight

40:17 line if we were in three dimensions then

40:20 it would be a straight plane separating

40:22 two spaces but here in two dimensions

40:25 it's a straight line and it will always

40:26 be a straight line if your classifier is

40:29 a linear classifier but you will see

40:31 later that when we build nonlinear

40:34 classifiers then the prediction boundary

40:36 separator won't be a straight line

40:38 anymore I won't tell you more right now

40:40 and I will let you wait for the surprise

40:42 so here we can clearly see that our

40:45 logistic regression classifier manages

40:47 to catch most of the users who didn't

40:50 buy the SUV in the red region here and

40:52 most of the users who bought the SUV in

40:56 the green region here so it actually did

40:58 a pretty good job however it seems to

41:01 have trouble catching some green users

41:03 here who in spite of their low salary

41:05 bought the luxury SUV as well as those

41:09 other green users here who also bought

41:11 the luxury SUV because as you can see

41:14 these green points here and those here

41:16 are in the red region which is the

41:18 region where our classifier predicts

41:21 that the users don't buy the SUV and

41:24 those incorrect predictions are due

41:26 specifically to the fact that our

41:28 classifier is a linear classifier and

41:31 because our users are not linearly

41:34 distributed if they were linearly

41:36 distributed then we would have all the

41:38 green points here in this space and all

41:40 the red points here in the space and

41:42 then a linear classifier with a straight

41:44 line could perfectly separate all the

41:46 red points here and all the green points

41:48 here but here we have some rebellious

41:50 points who are not in the wanted linear

41:52 regions and because our classifier has a

41:54 linear straight line separator that's

41:57 why it has trouble catching those users

41:59 here and those here you can clearly see

42:02 that even if you try to rotate this

42:04 straight line here well you will always

42:07 have some green points in the wrong

42:09 category for example if we try to rotate

42:11 here this way like putting it down well

42:15 ok we will catch these green points here

42:17 in the right green region here but since

42:20 we rotated it down we will take more

42:22 green users here because this will go up

42:25 and more green users here will be in the

42:28 red region so that's the best separator

42:31 the logistic regression classifier could

42:33 find and it couldn't do better because

42:35 it can only be a straight line

42:37 separating these two regions because to

42:40 catch those users the green users here

42:42 and the green users here in the right

42:43 category that is the green region our

42:45 classifier would need to make some kind

42:47 of a curve here too you know

42:49 classify correctly those green users

42:52 here and here and place them in a green

42:53 region and that would

42:55 prevent our classifier from making these

42:56 incorrect predictions here because instead of

42:59 a straight line, with a curve here we

43:02 would catch all the red users properly

43:03 in the red region and all the green

43:05 users in the green region so that would

43:08 make an awesome classifier and you will

43:10 see how our nonlinear classifiers will

43:12 do a terrific job at this I

43:14 can't wait to show you this ok

43:17 and now eventually the last thing very

43:19 important to understand is that this is

43:21 the training set this is a training set

43:24 so that means that our classifier

43:26 learned how to classify based on these

43:28 information here so I would hold my

43:30 breath a few more seconds until I find

43:32 out if our logistic regression

43:34 classifier can manage to make good

43:36 predictions of new observations that is

43:38 to classify new users into the right

43:40 regions which by the way are fixed

43:43 regions here because these are the

43:45 regions generated by the learning

43:46 experience of our logistic regression

43:48 classifier and therefore won't change if

43:51 we look at some new observations that is

43:53 new social network users and that's what

43:56 we are about to find out on the test set

43:58 so hold on

44:01 so what I'm gonna do now I'm going to

44:03 copy this because we don't want to

44:06 because we want to be efficient and

44:09 paste it here and now I'm just going to

44:12 replace training here by test alright

44:18 and here I just have to replace X train

44:22 by X test and Y train by Y test that's

44:28 all that's all I need to do oh and maybe

44:30 the title I'm gonna change the title of

44:31 the graph test okay so let's now see how

44:36 our logistic regression classifier

44:38 predicts the new observations on the

44:41 test set on which our model wasn't built

44:44 so command or control + enter to execute

44:47 and let's see all right not bad not bad

44:52 I'm going to enlarge this here we go so

44:56 this looks good this looks very good

44:59 these are the real results in red that

45:01 means the users that didn't buy the SUV

45:03 in reality and these are the customers

45:05 that bought it and we can see that the

45:07 prediction regions predict well those

45:10 real values because all those red points

45:12 here are in the red region so that's the

45:14 correct predictions all those green

45:16 points here are in the green region so

45:18 that's some other correct predictions

45:19 and of course since this is a linear

45:21 classifier the logistic regression makes

45:24 a few mistakes here but that's fine and

45:26 that's actually the incorrect

45:28 predictions we saw on the confusion

45:30 matrix remember there was 11 incorrect

45:33 predictions we can count them here this

45:35 green point should be here so it's 1

45:37 then 2 3 4 5 6 7 8 and then we have to

45:44 count the red points in the green

45:46 region that were incorrectly predicted so

45:49 9 10 and 11 okay so yes that's the 11

45:54 incorrect predictions that we found in

45:56 the confusion matrix alright so that's

46:00 the first classification model we built

46:02 I hope you like the graph okay so

46:06 congratulations you implemented your

46:08 first classification model on Python in

46:10 the next tutorials we're going to do it

46:12 on our and now for those of you

46:14 who are interested in understanding how

46:16 the graph was made stay with me and I

46:19 will explain right now how it works okay

46:22 so the best way to explain it is to

46:26 reselect this I'm gonna press Ctrl + Enter to

46:29 execute and here's the graph so what is

46:33 the idea how did we plot those

46:35 prediction regions well the idea is

46:39 actually pretty cool because we took all

46:42 the pixels of this frame here that means

46:45 this pixel this pixel all the pixels but

46:48 actually with a 0.01 resolution so we

46:51 took all the pixel points of this

46:53 frame here and we applied our classifier

46:55 on it and so you know it's like each of

46:58 these pixel points is a user of the

47:00 social network with a salary and an age

47:02 so it's like it's like an observation

47:05 only it's not their observations we have

47:07 in our data set it's a new observation

47:09 we create but we have to picture this

47:13 like a user in the social network with an

47:16 estimated salary and an age so for each

47:16 of these new pixel points these new

47:19 pixel observation points we applied our

47:22 logistic regression model to predict if

47:25 this pixel observation point has value 0

47:28 or 1 and if our classifier predicts 0

47:32 then it's going to colorize this pixel

47:34 points in red and if the classifier

47:37 predicts 1 it's going to colorize the

47:39 pixel point in green so by doing this on

47:43 all the pixel points in the frame here

47:46 it colorizes all the pixel points

47:49 that have the 0 prediction, that is this

47:52 region here, and all the pixel points

47:54 that have the 1 prediction and since the

47:57 logistic regression is a classifier the

48:00 limits between those sets of points is a

48:03 straight line because as I told earlier

48:05 the logistic regression is a linear

48:08 classifier which means it's a straight

48:10 line in two dimensions and now that you

48:13 understand that you will understand this

48:14 code very well ok so now let's go

48:19 through the lines one by one we start by

48:21 importing the ListedColormap class

48:23 that will help us colorize all the data

48:26 points

48:27 then we create this line to just create

48:29 some local variables X set so that we

48:31 can replace X train and Y train easily

48:34 by X test and Y test instead of having

48:36 to replace it everywhere because you

48:38 know we use X set several time so you

48:41 know there's just a shortcut to avoid

48:43 having to replace too many times the X

48:45 test and X train here okay here with

48:49 these two lines of code x1 and x2 equals

48:51 np.meshgrid we prepare the grid you

48:54 know with all the pixel points that I

48:57 just talked to you about so here as you

48:59 can see we take the minimum values of

49:02 the age values minus one because we

49:05 don't want our points to be squeezed on

49:07 the axis here and the maximum values of

49:09 the age you know to get the range of

49:11 pixels we want to include in the frame

49:14 and same for this the salary we're

49:17 taking the minimum salary minus 1 and

49:19 the maximum salary plus 1 to get all the

49:23 estimated salary range and minus 1 and

49:25 plus 1 so that the points aren't

49:27 squeezed and here we are choosing the

49:28 step 0.01 that means that there

49:31 is a 0.01 resolution so for

49:34 example if I had chosen 0.5 this

49:36 wouldn't have been that dense we would

49:39 actually see the pixel points but that's

49:41 better to take this resolution because

49:43 these are nice prediction regions it's

49:44 like it's continuous okay and then with

49:48 this line of code that's where the whole

49:50 magic happens because this is where we

49:53 apply the classifier on all the pixel

49:55 observation points and by doing that it

49:59 colorized us all the red pixel points

50:01 and all the green pixel points and we

50:02 use this contour function to actually

50:05 make the contour between the two

50:07 prediction regions the red one and the

50:09 green one then as you can see here we

50:11 use the predict function of our

50:13 classifier the logistic regression

50:15 classifier to predict if each of the

50:18 pixel points belongs to class 0 or class

50:20 1 then if the pixel point belongs to

50:22 class 0 it will be colorized in red and

50:24 if the pixel point belongs to class one

50:26 it will be colorized in green ok so here

50:30 we plot the limits of X the age and y

50:32 the estimated salary then with this loop

50:35 here we plot all the data points that

50:37 are the real values so all the red

50:40 here and the green data points so we're

50:42 using this PLT dot scatter which is what

50:45 we use from matplotlib to make a scatter

50:47 plot and then finally here we add the

50:49 title logistic regression training set

50:51 the X label age and the Y label

50:54 estimated salary we plot the legend to

50:57 specify that the red points correspond

50:59 to zero that is a user didn't buy and

51:00 the green point corresponds to one that

51:02 is the user bought the product and plt.show

51:06 to display the graph and specify that

51:09 this is the end of the graph alright so
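The pixel-grid approach explained above (classify every point of a fine `np.meshgrid`, colorize the two regions with `contourf`, then scatter the real observations on top) can be sketched like this. All data here is synthetic stand-in data, and the exact styling and file name are assumptions, not the tutorial's prepared code.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the scaled (age, salary) training set.
rng = np.random.default_rng(42)
X_set = rng.normal(size=(200, 2))
y_set = (X_set[:, 0] + X_set[:, 1] > 0).astype(int)
classifier = LogisticRegression().fit(X_set, y_set)

# Grid of "pixel" points covering the frame, with a 0.01 resolution
# and a -1/+1 margin so the observations aren't squeezed on the axes.
X1, X2 = np.meshgrid(
    np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01),
    np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01),
)

# Predict a class for every grid point and colorize the two regions;
# the limit between them is the (straight) decision boundary.
Z = classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape)
plt.contourf(X1, X2, Z, alpha=0.75, cmap=ListedColormap(("red", "green")))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())

# Overlay the real observation points, colored by their true label.
for j, colour in zip((0, 1), ("red", "green")):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color=colour, edgecolors="black", label=j)
plt.title("Logistic Regression (Training set)")
plt.xlabel("Age (scaled)")
plt.ylabel("Estimated Salary (scaled)")
plt.legend()
plt.savefig("decision_regions.png")
```

Swapping `X_set`, `y_set` for the test-set arrays reproduces the second plot; the fitted classifier, and therefore the prediction regions, stay exactly the same.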

51:11 congratulations if you are still here

51:13 congratulations for having the curiosity

51:15 of wondering how we can build such a

51:18 plot so that's a lot of things to learn

51:20 in the second