Machine learning is a subfield of computer science (more specifically, soft computing) that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.

Netflix and YouTube make recommendations using machine learning. When we use credit cards to make a purchase, the fraud protection on them is machine learning as well. Many of the tools we use every day are built on top of machine learning.

Machine learning provides techniques to look at data, to understand it, and to apply algorithms to it. Broadly, there are three types of machine learning algorithms:

  1. Supervised Learning:

These algorithms use a target/outcome variable that is to be predicted from a given set of independent variables. Using this set of variables, we generate a function that maps inputs to desired outputs. The training process continues until the model achieves a desired level of accuracy on the training data. Examples of Supervised Learning: Regression, Decision Tree, Random Forest, KNN, Logistic Regression, etc.
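The idea can be sketched in a few lines of Python: given labelled 1-D data (hypothetical values), "training" means searching for the decision rule that best reproduces the known outcomes.

```python
# Minimal sketch of supervised learning: learn a 1-D threshold
# classifier by trying every candidate split on labelled data and
# keeping the most accurate rule. Data is invented for illustration.

def learn_threshold(xs, labels):
    """Return (threshold, accuracy) of the best rule 'predict 1 if x >= t'."""
    best_t, best_acc = None, -1.0
    for t in xs:
        correct = sum((x >= t) == bool(y) for x, y in zip(xs, labels))
        acc = correct / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Training data: feature values paired with known 0/1 outcomes.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]

t, acc = learn_threshold(xs, labels)
print(t, acc)   # 6.0 1.0 — the rule 'x >= 6' separates the data perfectly
```

The model (here just a threshold) is then used to predict outcomes for new, unseen inputs.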

  2. Unsupervised Learning:

In this algorithm, we do not have any target or outcome variable to predict or estimate. It is used for clustering a population into different groups, which is widely applied for segmenting customers into different groups for specific interventions. Examples of Unsupervised Learning: the Apriori algorithm, K-means.

  3. Reinforcement Learning:

Using this algorithm, the machine is trained to make specific decisions. It works this way: the machine is exposed to an environment where it trains itself continually through trial and error. It learns from past experience and tries to capture the best possible knowledge to make accurate decisions. Example of Reinforcement Learning: the Markov Decision Process.

LIST OF COMMON MACHINE LEARNING ALGORITHMS

  1. Naive Bayes Classifier Algorithm:

The Naïve Bayes classifier is amongst the most popular learning methods. It works on Bayes' theorem of probability to build machine learning models, particularly for disease prediction and document classification. It is a simple classifier that applies Bayes' theorem to, for example, the words in a document for subjective analysis of content.

When to use the Naïve Bayes classifier?

  1. If you have a moderate or large training data set.
  2. If the instances have several attributes.
  3. Given the class, the attributes that describe the instances should be conditionally independent.

Applications:

  1. Sentiment Analysis – it is used at Facebook to analyse status updates expressing positive or negative emotions.
  2. Document Categorization – Google uses document classification to index documents and find relevance scores; pages parsed and classified this way feed into ranking alongside link-based signals such as PageRank.
  3. The Naïve Bayes algorithm is also used for classifying news articles about technology, entertainment, sports, politics, etc.
  4. Email Spam Filtering – Gmail uses the Naïve Bayes algorithm to classify your emails as spam or not spam.
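The mechanics can be sketched from scratch in Python with Laplace smoothing. The toy spam/ham data below is invented for illustration and is not how Gmail's actual filter is built:

```python
# A minimal Naive Bayes text classifier: score each class by
# log P(class) + sum of log P(word | class), with Laplace smoothing
# so unseen words never yield a zero probability.
import math
from collections import Counter

train = [("win cash now", "spam"), ("cheap cash prize", "spam"),
         ("meeting at noon", "ham"), ("lunch at noon today", "ham")]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def classify(text):
    """Return the class with the highest posterior log-probability."""
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():
            # Laplace smoothing: add 1 to every word count.
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("cash prize now"))   # spam
print(classify("lunch meeting"))    # ham
```

The "naive" assumption is visible in the inner loop: each word contributes independently, given the class.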
  2. Support Vector Machines:

A Support Vector Machine is a supervised machine learning algorithm for classification or regression problems in which the data set teaches the SVM about the classes, so that the SVM can classify new data. It works by finding a line (hyperplane) which separates the training data set into classes. As there are many such hyperplanes, the SVM algorithm tries to maximize the distance between the classes involved; this is referred to as margin maximization. If the line that maximizes the distance between the classes is identified, the chance that the model generalizes well to unseen data increases.

SVMs are classified into two categories:

  • Linear SVMs – the training data can be separated by a hyperplane.
  • Non-linear SVMs – the training data cannot be separated by a hyperplane. For example, the training data for face detection consists of a group of images that are faces and another group of images that are not faces (in other words, all other images in the world except faces). Under such conditions, the training data is so complex that it is difficult to find a representation for every feature vector. Separating the set of faces linearly from the set of non-faces is a complex task.
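The hyperplane idea behind linear SVMs can be sketched as follows; the weights `w` and bias `b` below are hypothetical values standing in for the output of a real solver:

```python
# Once training has produced weights w and bias b, a point x is
# classified by the sign of w . x + b, and its geometric distance to
# the separating hyperplane is |w . x + b| / ||w||.
import math

w = [1.0, 1.0]   # assumed hyperplane normal
b = -3.0         # assumed offset: the boundary is x1 + x2 = 3

def classify(x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def margin(x):
    """Distance from x to the separating hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

print(classify([4.0, 2.0]))   # 1  (above the line x1 + x2 = 3)
print(classify([0.0, 1.0]))   # -1 (below the line)
print(round(margin([4.0, 2.0]), 3))   # distance to the boundary
```

Margin maximization means training chooses, among all separating `(w, b)` pairs, the one for which the closest training points have the largest such distance.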

Advantages of Using SVM

  • SVM often offers the best classification accuracy on the training data.
  • SVM tends to classify future (unseen) data correctly.
  • SVM does not make any strong assumptions about the data.
  • It is relatively resistant to over-fitting.

Applications:

SVM is commonly used for stock market forecasting by various financial institutions. For instance, it can be used to compare the relative performance of stocks within the same sector. This relative comparison helps inform investment decisions based on the classifications made by the SVM learning algorithm.

  3. Decision Tree Classifier:

It is a type of supervised learning algorithm that is mostly used for classification problems. It works for both categorical and continuous dependent variables. In this algorithm, we split the population into two or more homogeneous sets, based on the most significant attributes/independent variables, to make the groups as distinct as possible.
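The splitting step described above can be sketched in Python: try every (feature, threshold) candidate on a toy data set and keep the split whose two groups are purest, measured here by Gini impurity:

```python
# One node of a decision tree: choose the (feature, threshold) pair
# that minimizes the weighted Gini impurity of the two child groups.
# The data set below is invented for illustration.

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """rows: list of feature vectors. Return (feature_index, threshold)."""
    best = (None, None, float("inf"))
    for f in range(len(rows[0])):
        for t in {row[f] for row in rows}:
            left = [y for row, y in zip(rows, labels) if row[f] < t]
            right = [y for row, y in zip(rows, labels) if row[f] >= t]
            # Weighted impurity of the two child groups.
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best[2]:
                best = (f, t, score)
    return best[0], best[1]

rows = [[2.0, 1.0], [3.0, 1.0], [10.0, 2.0], [11.0, 2.0]]
labels = [0, 0, 1, 1]
print(best_split(rows, labels))   # (0, 10.0): split on feature 0 at 10.0
```

A full tree applies this search recursively to each child group until the groups are pure or too small.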

When to use Decision Tree Machine Learning Algorithm

  • Decision trees are robust to errors; if the training data contains errors, decision tree algorithms are well suited to the problem.
  • Decision trees are best suited for problems where instances are represented by attribute-value pairs.
  • If the training data has missing values, decision trees can still be used, as they can handle missing values by looking at the data in other columns.
  • Decision trees are best suited when the target function has discrete output values.

Applications of Decision Tree Machine Learning Algorithm

  • Decision trees are among the popular machine learning algorithms that find great use in finance for option pricing.
  • Remote sensing is an application area for pattern recognition based on decision trees.
  • They are used by banks to classify loan applicants by their probability of defaulting on payments.
  • Gerber Products, a popular baby product company, used decision tree machine learning algorithm to decide whether they should continue using the plastic PVC (Poly Vinyl Chloride) in their products.
  • Rush University Medical Centre has developed a tool named Guardian that uses a decision tree machine learning algorithm to identify at-risk patients and disease trends.


  4. K-means Clustering Algorithm:

K-means is a widely used unsupervised machine learning algorithm for cluster analysis. K-means is a non-deterministic and iterative method. The algorithm operates on a given data set with a pre-defined number of clusters, k. The output of the K-means algorithm is k clusters with the input data partitioned among them.
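Lloyd's iteration for K-means can be sketched from scratch on 1-D toy data: assign points to the nearest centroid, move each centroid to the mean of its cluster, and stop when nothing changes. The points and seed centroids below are invented for illustration:

```python
# From-scratch K-means on 1-D data. Alternates an assignment step
# (nearest centroid per point) and an update step (centroid := mean
# of its assigned points) until the centroids stop moving.

def kmeans(points, centroids, max_iter=100):
    assign = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        assign = [min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
                  for p in points]
        # Update step: move centroids to the mean of their clusters.
        new = []
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == i]
            new.append(sum(members) / len(members) if members else centroids[i])
        if new == centroids:          # converged
            break
        centroids = new
    return centroids, assign

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, assign = kmeans(points, centroids=[1.0, 12.0])   # k = 2 seeds
print(centroids)   # [2.0, 11.0]
print(assign)      # [0, 0, 0, 1, 1, 1]
```

The non-determinism mentioned above enters through the choice of initial centroids; different seeds can converge to different partitions.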

Advantages of using K-Means Clustering Machine Learning Algorithm

  • In the case of globular clusters, K-means produces tighter clusters than hierarchical clustering.
  • For a small value of k, K-means computes faster than hierarchical clustering on a large number of variables.

Applications:

The K-means clustering algorithm is used by search engines such as Yahoo and Google to cluster web pages by similarity and identify the 'relevance rate' of search results. This helps search engines reduce computation time for users.

  5. Linear Regression:

The Linear Regression algorithm shows the relationship between two variables and how a change in one variable impacts the other. The algorithm shows the impact on the dependent variable of changing the independent variable. The independent variables are referred to as explanatory variables, as they explain the factors that impact the dependent variable. The dependent variable is often referred to as the factor of interest or response.
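For one explanatory variable, the fit has a closed form: the slope is the covariance of x and y divided by the variance of x. A minimal sketch on invented data:

```python
# Ordinary least squares for simple linear regression y = w*x + b.
# Slope w = cov(x, y) / var(x); intercept b = mean(y) - w * mean(x).

def fit_line(xs, ys):
    """Return slope w and intercept b minimising the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b

# Toy monthly data: the points lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

w, b = fit_line(xs, ys)
print(w, b)        # 2.0 1.0 — the learned relationship
print(w * 5 + b)   # 11.0 — forecast for the next unseen input
```

This is the mechanism behind the sales-forecasting use case below: fit the line to past months, then extrapolate it forward.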

Advantages of Linear Regression Machine Learning Algorithm

  • It is one of the most interpretable machine learning algorithms, making it easy to explain to others.
  • It is easy to use, as it requires minimal tuning.
  • It is one of the most widely used machine learning techniques, and it runs fast.

Applications of Linear Regression

  • Estimating Sales: Linear regression finds great use in business for sales forecasting based on trends. If a company observes a steady increase in sales every month, a linear regression analysis of the monthly sales data helps the company forecast sales in upcoming months.
  • Risk Assessment


  6. Logistic Regression:

The name of this algorithm can be a little confusing, in the sense that the Logistic Regression machine learning algorithm is for classification tasks, not regression problems. The name 'regression' comes from the fact that a linear model is fit in the feature space. The algorithm applies a logistic function to a linear combination of features to predict the outcome of a categorical dependent variable based on predictor variables.

The odds or probabilities that describe the outcome of a single trial are modelled as a function of the explanatory variables. Logistic regression algorithms help estimate the probability of falling into a specific level of the categorical dependent variable based on the given predictor variables.

Suppose that you want to predict whether there will be snowfall tomorrow in New York. Here the outcome of the prediction is not a continuous number, because there will either be snowfall or no snowfall, so linear regression cannot be applied. Here the outcome variable is one of several categories, and logistic regression is the right tool.
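The core computation can be sketched as a logistic (sigmoid) function applied to a linear combination of features; the feature names and weights below are hypothetical, not fitted values:

```python
# The logistic function squashes a linear score w . x + b into (0, 1),
# which is then read as the probability of the positive class.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Probability of the positive class for feature vector x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical features: [temperature below freezing (0/1), humidity fraction]
w, b = [2.5, 1.5], -2.0
p = predict_proba([1.0, 0.8], w, b)
print(round(p, 3))            # probability of snowfall for this input
print(1 if p >= 0.5 else 0)   # classify with a 0.5 threshold
```

In a real model the weights would be fitted to data (typically by maximizing the likelihood), not set by hand as here.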

Based on the nature of categorical response, logistic regression is classified into 3 types –

  • Binary Logistic Regression – the most commonly used logistic regression, where the categorical response has two possible outcomes, i.e. yes or no. Examples: predicting whether a student will pass or fail an exam, predicting whether a person has low or high blood pressure, predicting whether a tumour is cancerous or not.
  • Multinomial Logistic Regression – the categorical response has three or more possible outcomes with no ordering. Example: predicting which search engine (Yahoo, Bing, Google, or MSN) is used by the majority of US citizens.
  • Ordinal Logistic Regression – the categorical response has three or more possible outcomes with a natural ordering. Example: how a customer rates the service and quality of food at a restaurant on a scale of 1 to 10.
  7. Random Forest:

Random Forest is a trademark term for an ensemble of decision trees. In a Random Forest, we have a collection of decision trees (hence "forest"). To classify a new object based on its attributes, each tree gives a classification, and we say the tree "votes" for that class. The forest chooses the classification with the most votes (over all the trees in the forest).
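The voting step can be sketched in miniature: each "tree" below is a hypothetical hard-coded rule rather than a learned tree, and the forest returns the majority class:

```python
# Majority voting over an ensemble. Real Random Forests learn each
# tree on a bootstrap sample with random feature subsets; here the
# trees are stand-in rules so the voting mechanics stay visible.
from collections import Counter

def tree_1(x):   # splits on feature 0
    return "high_risk" if x[0] > 5 else "low_risk"

def tree_2(x):   # splits on feature 1
    return "high_risk" if x[1] > 2 else "low_risk"

def tree_3(x):   # splits on a combination of both features
    return "high_risk" if x[0] + x[1] > 6 else "low_risk"

def forest_predict(x, trees):
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]   # class with the most votes

trees = [tree_1, tree_2, tree_3]
print(forest_predict([6.0, 1.0], trees))   # high_risk wins 2-1
```

Because the trees disagree on hard cases, the ensemble's majority vote is typically more robust than any single tree.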

Applications of Random Forest Machine Learning Algorithms

  • Random Forest algorithms are used by banks to predict whether a loan applicant is likely to be high risk.
  • They are used in the automobile industry to predict the failure or breakdown of a mechanical part.
  • They are used in the healthcare industry to predict whether a patient is likely to develop a chronic disease.
  • They can also be used for regression tasks, such as predicting the average number of social media shares and performance scores.
  • Recently, the algorithm has also made its way into predicting patterns in speech recognition software and classifying images and texts.


  8. Apriori Algorithm:

The Apriori algorithm is an unsupervised machine learning algorithm that generates association rules from a given data set. An association rule implies that if an item A occurs, then item B also occurs with a certain probability. Most of the association rules generated are in the IF-THEN format. For example, IF people buy an iPad, THEN they also buy an iPad case to protect it. For the algorithm to derive such conclusions, it first observes the number of people who bought an iPad case while purchasing an iPad. In this way a ratio is derived, e.g. out of 100 people who purchased an iPad, 85 also purchased an iPad case.

Basic principle on which Apriori Machine Learning Algorithm works:

  • If an item set occurs frequently, then all subsets of the item set also occur frequently.
  • If an item set occurs infrequently, then all supersets of the item set also occur infrequently.
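The two principles can be sketched on a toy basket data set: count single items, prune the infrequent ones, and build candidate pairs only from the survivors. The baskets below are invented for illustration:

```python
# First two Apriori passes: frequent 1-item sets, then candidate
# pairs generated only from frequent items (the pruning rule: a
# superset of an infrequent set cannot itself be frequent).
from itertools import combinations

baskets = [{"ipad", "case"}, {"ipad", "case", "pen"},
           {"ipad", "pen"}, {"case", "pen"}, {"ipad", "case"}]
min_support = 3   # an itemset must appear in at least 3 baskets

def support(itemset):
    """Number of baskets containing every item of the itemset."""
    return sum(itemset <= b for b in baskets)

items = {i for b in baskets for i in b}
frequent_1 = {frozenset([i]) for i in items if support({i}) >= min_support}

candidates = {a | b for a, b in combinations(frequent_1, 2)}
frequent_2 = {c for c in candidates if support(c) >= min_support}

print(sorted(sorted(c) for c in frequent_2))   # [['case', 'ipad']]
```

From the surviving itemset {ipad, case} one can then read off rules such as "IF ipad THEN case" with confidence support({ipad, case}) / support({ipad}).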

Applications:

Google auto-complete – when the user types a word, the search engine looks for other associated words that people usually type after it.

  9. Artificial Neural Networks:

An artificial neural network (ANN), usually called a "neural network" (NN), is a learning algorithm inspired by the structure and functional aspects of biological neural networks. Computations are structured in terms of an interconnected group of artificial neurons, processing information using a connectionist approach to computation. Modern neural networks are non-linear statistical data modelling tools. They are usually used to model complex relationships between inputs and outputs, to find patterns in data, or to capture the statistical structure of an unknown joint probability distribution between observed variables.
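A tiny feed-forward network can be sketched by hand: two hidden neurons and one output neuron with step activations and fixed (not learned) weights, wired so the network computes XOR, a function no single linear unit can represent:

```python
# Each neuron computes a weighted sum of its inputs plus a bias and
# passes it through a step activation. The weights are set by hand
# here; a real network would learn them, e.g. by backpropagation.

def step(z):
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_net(x1, x2):
    h_or  = neuron([x1, x2], [1, 1], -0.5)       # fires if x1 OR x2
    h_and = neuron([x1, x2], [1, 1], -1.5)       # fires if x1 AND x2
    return neuron([h_or, h_and], [1, -1], -0.5)  # OR but not AND

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))   # 0, 1, 1, 0
```

The hidden layer is what makes this non-linear: neither the OR nor the AND unit alone can express XOR, but their combination can.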

Applications:

  • Function approximation, or regression analysis, including time series prediction, fitness approximation and modeling.
  • Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
  • Data processing, including filtering, clustering, blind source separation and compression.
  • Robotics, including directing manipulators and prostheses.
  • Control, including computer numerical control.
  • Artificial neural networks have also been used to diagnose several cancers.

  10. K-Nearest Neighbors Algorithm:

It can be used for both classification and regression problems, though it is more widely used for classification in industry. K-nearest neighbors is a simple algorithm that stores all available cases and classifies new cases by a majority vote of their k nearest neighbors. The case is assigned to the class most common amongst its k nearest neighbors, as measured by a distance function.

These distance functions can be Euclidean, Manhattan, Minkowski, or Hamming distance. The first three are used for continuous variables and the fourth (Hamming) for categorical variables. If k = 1, the case is simply assigned to the class of its nearest neighbor. Choosing k can be a challenge when performing KNN modeling.
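The algorithm is short enough to sketch from scratch on toy 2-D data using Euclidean distance (the "benign"/"malignant" labels are illustrative, not a real diagnosis data set):

```python
# KNN: no training phase beyond storing the cases. To classify a
# query, sort the stored cases by distance and take a majority vote
# among the k nearest.
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, query, k=3):
    """train: list of (point, label) pairs. Vote among the k nearest."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), "benign"), ((1.5, 2.0), "benign"),
         ((2.0, 1.5), "benign"), ((6.0, 6.0), "malignant"),
         ((7.0, 6.5), "malignant"), ((6.5, 7.0), "malignant")]

print(knn_classify(train, (2.0, 2.0), k=3))   # benign
print(knn_classify(train, (6.0, 7.0), k=3))   # malignant
```

The sketch also makes the cost concern below concrete: every prediction scans and sorts the whole training set, which is why KNN is computationally expensive at query time.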

Things to consider before selecting KNN:

  • KNN is computationally expensive.
  • Variables should be normalized, or else higher-range variables can bias the results.
  • KNN requires more work in the pre-processing stage, such as outlier and noise removal.

Applications:

  • Breast cancer diagnosis
  • Forecasting stock markets: predicting the price of a stock on the basis of company performance measures and economic data
  • Understanding and managing financial risks
  • Loan management