What are we looking for?

Data Science -> Artificial Intelligence
Throughout history, subjects like statistics, optimization, data mining, machine learning and artificial intelligence have been connected to build bots that make life easier, more comfortable and faster (the fields of data science). In the artificial intelligence sub-field, computers need to receive and store data from sensors through their networks to perceive their environment, learn, and take actions that satisfy themselves/humans!

Artificial Intelligence sub-fields

Data Science -> Artificial Intelligence -> Machine Learning
The way computers run the whole process needs some sort of algorithm that helps them reason step by step (or in parallel). The procedures of these algorithms are made by humans, but this could also happen independently. The major procedure headings used in machine learning are listed below:

  • Supervised learning (task driven)
    • Regression (for continuous data)
      • Linear
      • Logistic
      • Generalized Linear Model (GLM)
      • Gaussian Process Regression (GPR)
      • Support Vector Regression (SVR)
      • Ensemble Methods
      • Decision Tree
    • Classification (for categorical data)
      • Naive Bayes
      • Support Vector Machine (SVM)
      • Random Decision Forest
      • AdaBoost
      • Gradient Boosting
      • Logistic
      • Nearest Neighbour (NN)
      • Discriminant Analysis
  • Unsupervised learning (data driven)
    • Clustering
      • K-means
      • k-Medoids
      • KNN
      • Hierarchical
      • Gaussian Mixture
      • Hidden Markov Model
    • Dimensionality Reduction
      • PCA
      • SVD
  • Semi-Supervised learning
    • Self Training
    • Low Density Separation Models
    • Graph based algorithms
  • Reinforcement learning (learn to react)
    • Dynamic Programming
    • Monte Carlo Methods
    • Heuristic Methods
Reinforcement Learning Algorithms
  • Self learning
  • Feature learning
  • Sparse dictionary learning
  • Anomaly detection (outlier detection)
  • Robot learning
  • Association rule learning

Running any of these procedures requires creating models to train; models can be nature-inspired, imaginary thoughts, logical reasoning, etc. Some of them are mentioned in the lists above and the rest of them are:

  • Artificial Neural Networks (ANN)
    • Feedforward
    • Radial Basis
    • Kohonen
    • Learning vectors
    • Modular
    • Recurrent
      • Hopfield
      • Elman/Jordan
      • Echo State
      • LSTM
      • BRNN
      • CTRNN
      • HRNN
      • RMLP
      • Second Order
      • Multi Time
      • Stochastic
Famous Neural Networks Schematic Structure
  • Genetic Algorithm (GA)
  • Bayesian networks (belief networks)
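As a taste of the first family above, a single forward pass through a fully connected feedforward network can be sketched as follows (the layer sizes, random weights and input values are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def feedforward(x, weights, biases):
    """One forward pass through a fully connected network
    with tanh hidden activations and a linear output layer."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(W @ a + b)      # hidden layers
    W, b = weights[-1], biases[-1]
    return W @ a + b                # linear output

# 2 inputs -> one hidden layer of 3 units -> 1 output (illustrative sizes)
sizes = [2, 3, 1]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]
y = feedforward(np.array([0.5, -1.0]), weights, biases)
print(y.shape)
```

Training (adjusting the weights from data) is what distinguishes the network types listed above; this sketch only shows the shared forward computation.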

For a clearer overview, famous algorithms used in machine learning are grouped below by similarity.

Machine Learning Algorithms

Let’s start coding!

Using ready-made algorithms embedded in packages and modules is an easy-peasy way to solve problems, but here we will write some algorithms from scratch and compare them with the sklearn, TensorFlow, etc. versions. The Python programming language will be used widely (R, … could also be used). For any questions here, feel free to contact me.

  1. Linear Regression (python code : Link)
    Mathematical definition of line : Y=mX+b (m: Slope, b: Y Intercept)
    here, for regression we will use mean-squared distances of points :
    slope m = ( mean(X)·mean(Y) − mean(X·Y) ) / ( mean(X)² − mean(X²) ) , intercept b = mean(Y) − m·mean(X)
    Test the code with Origin, Excel and Open-Office Spreadsheet.
Regression Code Output : Y=1.248…X+0.3333… with R²=0.9081… (purple point is predicted using the regression line)
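The slope and intercept formulas above can be sketched in plain Python (an illustrative version with a made-up dataset, not the linked code):

```python
def best_fit(xs, ys):
    """Least-squares line y = m*x + b using the mean formulas above."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_x2 = sum(x * x for x in xs) / n
    m = (mean_x * mean_y - mean_xy) / (mean_x ** 2 - mean_x2)
    b = mean_y - m * mean_x
    return m, b

def r_squared(xs, ys, m, b):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5, 6]     # made-up sample data
ys = [5, 4, 6, 5, 6, 7]
m, b = best_fit(xs, ys)
print(m, b, r_squared(xs, ys, m, b))
```

Checking `m` and `b` against a spreadsheet trendline (as suggested above) is a quick way to validate the implementation.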
  2. K-Nearest Neighbors (KNN) (python code : Link)
    Euclidean distance : d(p,q) = ( Σ (pᵢ − qᵢ)² )^0.5
    here, for new data we calculate the Euclidean distances and take the 'K' nearest neighbors. Like an election, the most common group among those neighbors wins (i.e. it determines which group our new data belongs to).
    For comparison, we plot center of masses and group boundary to see KNN results by changing ‘k’ number.
KNN algorithm result (group of new data changes with ‘k’ number)
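The voting idea can be sketched as follows (a minimal version with a made-up two-group dataset; `math.dist` computes the Euclidean distance defined above):

```python
from collections import Counter
import math

def knn_predict(data, new_point, k=3):
    """Vote among the k nearest labeled points (Euclidean distance)."""
    distances = []
    for label, points in data.items():
        for p in points:
            distances.append((math.dist(p, new_point), label))
    distances.sort()                              # nearest first
    votes = [label for _, label in distances[:k]]
    return Counter(votes).most_common(1)[0][0]    # majority label

# made-up dataset: two groups 'r' and 'b' of 2-D points
dataset = {'r': [(1, 2), (2, 3), (3, 1)],
           'b': [(6, 5), (7, 7), (8, 6)]}
print(knn_predict(dataset, (5, 6), k=3))
```

As the caption above notes, changing `k` can change the predicted group when the new point sits near a boundary.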
  3. Support Vector Machine (SVM) (python code: Link)
    loss function : hinge loss (faster run to maximize the margin)
    if sign(predicted value) = sign(actual value) –> cost : 0
    w : normal vector to hyperplane
    x : set of points (samples and features)
    bias : hyperplane intercept
    offset : support vectors' distances from the decision boundary (margins)
    margin : reinforcement range of values ([-1, 1])
    α : Learning rate
    λ : regularization parameter (to balance the margin maximization and loss)
    Iteration : repetition of process
    To get the gradient update, we take the partial derivatives: ∂f/∂wⱼ = { 0 if yᵢ(wᵀ·xᵢ + b) ≥ 1 ; −yᵢ·xᵢⱼ if yᵢ(wᵀ·xᵢ + b) < 1 }
    Min(λ||w||²) → w = w − α·(2λw) ; when the margin is violated, the hinge loss adds −yᵢ·xᵢ
    misclassification gradient update : w += α·(yᵢ·xᵢ − 2λw)
    otherwise gradient update : w -= α·(2λw)
simple Support vector machine (SVM) using hinge loss function (Supervised classification)
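The two update rules can be sketched with per-sample sub-gradient descent (a minimal version; the dataset and the hyperparameters α and λ are made up for illustration, and the bias b is folded into w as an extra constant feature):

```python
import numpy as np

def train_svm(X, y, alpha=0.01, lam=0.01, iterations=1000):
    """Sub-gradient descent on the regularized hinge loss.
    Labels y must be +1/-1; a constant column folds the bias into w."""
    Xb = np.c_[X, np.ones(len(X))]          # append bias feature
    w = np.zeros(Xb.shape[1])
    for _ in range(iterations):
        for xi, yi in zip(Xb, y):
            if yi * np.dot(w, xi) < 1:       # margin violated
                w += alpha * (yi * xi - 2 * lam * w)
            else:                            # correct side: only shrink w
                w -= alpha * (2 * lam * w)
    return w

# made-up linearly separable data
X = np.array([[1, 7], [2, 8], [3, 8], [5, 1], [6, -1], [7, 3]], float)
y = np.array([-1, -1, -1, 1, 1, 1])
w = train_svm(X, y)
preds = np.sign(np.c_[X, np.ones(len(X))] @ w)
print(preds)
```

Note that folding the bias into w also regularizes it slightly; keeping b separate, as in the derivation above, is the stricter formulation.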
  4. K-Means (python code : Link)
    The algorithm partitions all observations into 'k' clusters by nearest-mean assignment (i.e. all we need to specify is the number of clusters). First we tag points with the k centers and minimize the distances to the centers until the best centers satisfy the accuracy we need. The initial 'k' centers are the first 'k' points.
    For prediction, the Frobenius norm is used to find the distances from each center.
K-Means algorithm clustering (unsupervised clustering)
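The loop described above can be sketched as follows (a minimal Lloyd's-style version with made-up points; as in the text, the initial centers are simply the first k points):

```python
import math

def kmeans(points, k, tol=1e-6, max_iter=100):
    """Assign points to nearest center, move centers to cluster means,
    repeat until the centers stop moving (within tol)."""
    centers = [list(p) for p in points[:k]]     # first k points as centers
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:                        # nearest-mean assignment
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        new_centers = []
        for i, cl in enumerate(clusters):       # recompute means
            if cl:
                new_centers.append([sum(c) / len(cl) for c in zip(*cl)])
            else:
                new_centers.append(centers[i])  # keep empty cluster's center
        moved = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved < tol:
            break
    return centers, clusters

pts = [(1, 2), (1.5, 1.8), (5, 8), (8, 8), (1, 0.6), (9, 11)]
centers, clusters = kmeans(pts, k=2)
print(centers)
```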
  5. Mean-Shift (python code : Link)
    1st : every data point starts as a candidate cluster center (i.e. we look at all other points from each data point).
    2nd : the mean of all points closer than the radius becomes the new center.
    Finally, by repeating step 2 we get the optimum centers.
Mean shift algorithm clustering (unsupervised clustering)
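The two steps can be sketched as follows (a minimal flat-kernel version; the radius and the points are made up for illustration):

```python
import math

def mean_shift(points, radius=4.0, tol=1e-6, max_iter=300):
    """Step 1: every point is a candidate center.
    Step 2: move each center to the mean of points within `radius`,
    repeated until nothing moves; near-duplicate centers are merged."""
    centers = [list(p) for p in points]
    for _ in range(max_iter):
        new_centers = []
        for c in centers:
            in_window = [p for p in points if math.dist(p, c) <= radius]
            new_centers.append([sum(v) / len(in_window)
                                for v in zip(*in_window)])
        moved = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved < tol:
            break
    unique = []                       # merge centers that converged together
    for c in centers:
        if not any(math.dist(c, u) < tol * 10 for u in unique):
            unique.append(c)
    return unique

pts = [(1, 2), (1.5, 1.8), (1, 0.6), (8, 8), (8.5, 8.5), (9, 11)]
centers_ms = mean_shift(pts, radius=4)
print(centers_ms)
```

Unlike K-Means, no cluster count is specified: the number of surviving centers is decided by the radius.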

  6. Recurrent Neural Networks (RNNs)
    6.1. Multiple Timescales Recurrent Neural Network (MTRNN)
    6.2. Long Short-Term Memory (LSTM)

  7. Convolutional Neural Network (CNN)
