
Posts

Showing posts from May, 2023

Voting Classifier

In a voting classifier we pick several models, for example Logistic Regression, SVM, and KNN. Always remember that the number of models should be odd, because the final prediction is a majority vote. Suppose a new data point comes in and the outputs are:
1. Logistic Regression -> 1
2. SVM -> 0
3. KNN -> 1
Two models said 1 and only one model said 0, so our ensemble says 1. You can also use this logic for multiclass classification. A small runnable sketch follows.
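Here is a minimal sketch of hard (majority) voting with scikit-learn's VotingClassifier; the toy dataset and the train/test split are made up purely for illustration.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, random_state=1)   # made-up toy data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# an odd number of models, so a majority vote can never tie
vote = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("svm", SVC()),
                ("knn", KNeighborsClassifier())],
    voting="hard")                       # each model casts one vote; the majority class wins
vote.fit(X_train, y_train)
print(vote.score(X_test, y_test))        # accuracy of the combined model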

Ensemble Learning

Ensemble learning is a technique where we merge the power of several models.

Experiment: A man brought an animal, gathered a crowd, and said that whoever told him the exact weight of the animal would win a prize. Every single person guessed the wrong weight, but when the man took the mean of all the guesses he was surprised: it was exactly the animal's weight. That explains the power 💪 of the crowd.

Techniques:
1. Voting
2. Bagging
3. Boosting
4. Stacking

Log loss

Log loss is a loss function for classification; it is the loss used by logistic regression.
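The binary log loss formula is -(1/N) Σ [y·log(p) + (1-y)·log(1-p)]. Below is a short sketch computing it both by hand and with scikit-learn; the labels and probabilities are made-up example values.

import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([1, 0, 1, 1])           # actual labels
y_prob = np.array([0.9, 0.2, 0.7, 0.4])   # predicted probability of class 1

# manual formula: -(1/N) * sum(y*log(p) + (1-y)*log(1-p))
manual = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(manual, log_loss(y_true, y_prob))   # both give the same value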

Regularization code

Ridge Regression

from sklearn.linear_model import Ridge   # Importing Ridge
ridge = Ridge()
ridge.fit(X_train, y_train)
ridge.score(X_test, y_test)
ridge.predict(new_df)

Lasso Regression

from sklearn.linear_model import Lasso
lasso = Lasso()
lasso.fit(X_train, y_train)
lasso.score(X_test, y_test)
lasso.predict(new_df)

ElasticNet Regression

from sklearn.linear_model import ElasticNet
elastic = ElasticNet()
elastic.fit(X_train, y_train)
elastic.score(X_test, y_test)
elastic.predict(new_df)

NOTE -> λ is set by the alpha parameter of each model.

ElasticNet Regression

The L1 norm and the L2 norm are hard to use on very big data, because you can't know in advance whether you should use the L1 norm or the L2 norm. ElasticNet regression solves this problem by mixing both penalties. In this case λ is again one of the hyperparameters, together with the L1/L2 mix, as in the sketch below.
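A minimal sketch of ElasticNet in scikit-learn; the toy data and the alpha / l1_ratio values are made up for illustration.

from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=1)  # toy data

# alpha plays the role of λ (overall strength); l1_ratio mixes the two penalties:
# l1_ratio=1.0 behaves like Lasso, l1_ratio=0.0 like Ridge
elastic = ElasticNet(alpha=0.1, l1_ratio=0.5)
elastic.fit(X, y)
print(elastic.score(X, y))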

L1 norm V/S L2 norm

The L2 norm (Ridge) is used when you know that each and every column is well correlated with the output. The L1 norm (Lasso) is used when you know that some columns are useful and some are not, because it can push useless coefficients all the way to zero (see the small comparison below).

NOTE -> Neither is a good choice for very big data, because you can't know in advance whether you need the L1 norm or the L2 norm. That problem is solved by ElasticNet regression.
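A small sketch comparing the two penalties on made-up data where only the first two columns matter; the alpha value is illustrative.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=200)   # only 2 useful columns

print(Lasso(alpha=0.1).fit(X, y).coef_)   # the useless columns are driven to exactly 0
print(Ridge(alpha=0.1).fit(X, y).coef_)   # all columns are shrunk a little, none is exactly 0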

Mathematics of L1 norm

Let's get started with "What do we have to do?"

In simple linear regression (OLS), b = y_mean - m*x_mean and m = Σ(xi - x_mean)(yi - y_mean) / Σ(xi - x_mean)².

In the case of Lasso we derive everything again, but we don't re-derive b, because Lasso shrinks m; b keeps the same form.

1. Choose a loss function. In this case L = Σ(yi - pi)² + λ|m|.
2. Calculate dL/dm and set it to 0 to minimize the loss. For m > 0 this gives m = (Σ(xi - x_mean)(yi - y_mean) - λ) / Σ(xi - x_mean)².

Here you can clearly see that if we increase λ, m decreases. It affects b as well (through m), but not by much. λ is the hyperparameter throughout regularization.
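A tiny numeric sketch of the Lasso slope formula above, following the post's convention (constant factors dropped) and assuming m > 0; the data points are made up.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))
Sxx = np.sum((x - x.mean()) ** 2)

for lam in [0.0, 1.0, 5.0]:
    m = (Sxy - lam) / Sxx      # Lasso slope for m > 0; λ = 0 gives plain OLS
    print(lam, round(m, 3))    # m gets smaller as λ grows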

Mathematics of Ridge regression

Let's get started with "What do we have to do?"

In simple linear regression (OLS), b = y_mean - m*x_mean and m = Σ(xi - x_mean)(yi - y_mean) / Σ(xi - x_mean)².

In the case of Ridge we derive everything again, but we don't re-derive b, because Ridge shrinks m; b keeps the same form.

1. Choose a loss function. In this case L = Σ(yi - pi)² + λm².
2. Calculate dL/dm and set it to 0 to minimize the loss. This gives m = Σ(xi - x_mean)(yi - y_mean) / (Σ(xi - x_mean)² + λ).

Here you can clearly see that if we increase λ, m decreases. It affects b as well (through m), but not by much. λ is the hyperparameter throughout regularization.
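A tiny numeric sketch of the Ridge slope formula above, again with the post's convention of dropping constant factors; the data points are made up.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))
Sxx = np.sum((x - x.mean()) ** 2)

for lam in [0.0, 1.0, 10.0]:
    m = Sxy / (Sxx + lam)      # Ridge slope; λ = 0 gives plain OLS
    print(lam, round(m, 3))    # λ in the denominator shrinks m toward 0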

Regularization

Regularization is a technique that helps to reduce overfitting. With this technique our bias increases a little bit, but the variance improves a lot.

Working: You know that y = mx + b is a line where m is the slope; in machine learning terms it is the say/weightage of x in the model. If m is very high the model overfits, and if m is very low the model underfits. In short, we want to set a correct value of m. In regularization we simply shrink m, and to do that we add a term called the "regularization term" to our loss function (see the small sketch below).

There are 3 types of regularization:
Ridge Regression (L2)
Lasso Regression (L1)
ElasticNet Regression
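A tiny sketch of what "adding a regularization term" looks like in code, using the L2 (Ridge) penalty; the predictions, weights, and λ value are made-up examples.

import numpy as np

def ridge_loss(y_true, y_pred, w, lam):
    # ordinary squared-error loss  +  λ * (sum of squared weights)
    return np.sum((y_true - y_pred) ** 2) + lam * np.sum(w ** 2)

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.8, 5.3, 6.9])
w = np.array([2.0, -1.5])            # model weights (the m values)

print(ridge_loss(y_true, y_pred, w, lam=0.1))   # bigger weights -> bigger penalty -> bigger loss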

Loss function

A loss function is a quantity that helps to measure the loss of our model: whenever it is high the model performs poorly, and when it is low the model performs well. Today we learn about loss functions in regression; in another class we will discuss loss functions in classification.

There are several loss functions in regression:
MSE (Mean Squared Error)
MAE (Mean Absolute Error)
RMSE (Root Mean Squared Error)
R2 Score

MSE formula --> (1/n) Σ(yi - pi)²   Note - MSE is affected by outliers
MAE formula --> (1/n) Σ|yi - pi|
RMSE formula --> √((1/n) Σ(yi - pi)²)
R2 Score formula --> 1 - RSS/TSS, where RSS = residual sum of squares and TSS = total sum of squares
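A short sketch computing the four regression losses named above; the y_true and y_pred values are made up for illustration.

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 6.8, 9.4])

mse = mean_squared_error(y_true, y_pred)    # (1/n) Σ(yi - pi)²
mae = mean_absolute_error(y_true, y_pred)   # (1/n) Σ|yi - pi|
rmse = np.sqrt(mse)                         # square root of MSE
r2 = r2_score(y_true, y_pred)               # 1 - RSS/TSS

print(mse, mae, rmse, r2)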

Overfitting

Overfitting is a famous term in machine learning. It means that accuracy on our training data is good, but on testing data and on new data from users our model performs badly.

[Figure: good model vs. overfit model on the training data]

To understand this we have to study the bias-variance tradeoff. Bias and variance are both losses: bias is the loss on the training data, while variance is the loss on the testing data, and they have an inverse relationship, i.e. B is inversely proportional to V. In overfitting, bias is very low but variance is very high.

Prevention: If this is happening to you, simply use ensemble techniques.

Gradient Descent Code

from sklearn.linear_model import SGDRegressor   # Importing SGDRegressor
model = SGDRegressor()                          # Creating an instance of SGDRegressor
model.fit(X_train, y_train)                     # Fitting the training data
model.score(X_test, y_test)                     # Checking the R2 score
model.predict(new_data)                         # Predicting on new data we get from the user

Gradient descent for nD

As we know, gradient descent is an optimization technique. Today we learn about y_hat = B0 + B1x1 + B2x2 + ... + Bnxn. You already know how to apply gradient descent; here we just have to apply the same update to every parameter B0, B1, ..., Bn, as in the sketch below.
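A minimal NumPy sketch of gradient descent for y_hat = B0 + B1x1 + ... + Bnxn; the toy data, learning rate, and epoch count are made up.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # 3 input columns x1, x2, x3
y = 4 + X @ np.array([1.0, 2.0, 3.0])        # true B0 = 4 and B = [1, 2, 3]

B = np.zeros(3)                              # B1 ... Bn
B0 = 0.0                                     # intercept
lr = 0.05                                    # learning rate (made-up value)

for _ in range(1000):                        # epochs
    y_hat = B0 + X @ B
    error = y - y_hat
    B0 = B0 - lr * (-2 * np.mean(error))           # update B0 with its gradient
    B = B - lr * (-2 * X.T @ error / len(y))       # update every other coefficient

print(B0, B)                                 # close to 4 and [1, 2, 3]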

Gradient Descent for 2D data

Gradient descent is an optimization technique. In the previous class we studied OLS to find m and b, and you know that y = mx + b is the line of linear regression.

Steps to apply gradient descent:
1. Initialize m and b with random values.
2. for i in epochs: apply the gradient descent formula p = p - lr * dL/dp.

We have to apply these steps to all the parameters. Always remember: the gradient is dL/dp. A small sketch follows.
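A small sketch of the steps above for y = mx + b; the data points, learning rate, and epoch count are made-up values.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1                            # true m = 2, b = 1

m, b = 0.0, 0.0                          # step 1: start from initial values
lr = 0.01                                # learning rate

for _ in range(2000):                    # step 2: repeat for some epochs
    y_hat = m * x + b
    dl_dm = -2 * np.sum(x * (y - y_hat))   # gradient of L = Σ(y - y_hat)² w.r.t. m
    dl_db = -2 * np.sum(y - y_hat)         # gradient w.r.t. b
    m = m - lr * dl_dm                   # p = p - lr * dL/dp
    b = b - lr * dl_db

print(m, b)                              # close to 2 and 1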

Linear Regression

Simple Linear Regression

Linear regression is a machine learning algorithm for linear (or roughly linear) data. It fits the line y = mx + b with the loss L = Σ(yi - yi_hat)². Simple linear regression can only be applied to 2D data. So L is a function of m and b, and we have to calculate the values of m and b.

There are two ways in which we can calculate m and b:
1. Closed-form solution (OLS)
2. Non-closed-form solution (gradient descent)

Today we study the OLS (Ordinary Least Squares) method. First, the direct formulas:
m = Σ(xi - x_mean)(yi - y_mean) / Σ(xi - x_mean)²
b = y_mean - m * x_mean

Derivation: we have to find the m and b that minimize the loss, so we differentiate and set the derivatives to zero:
dL/db = 0 gives b = y_mean - m * x_mean
dL/dm = 0 gives m = Σ(xi - x_mean)(yi - y_mean) / Σ(xi - x_mean)²

A small numeric sketch follows.
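A minimal sketch of the OLS formulas above; the data points are made up for illustration.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)   # slope
b = y.mean() - m * x.mean()                                                 # intercept

print(m, b)    # slope and intercept of the best-fit line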