Deep Learning for Computer Vision

Learning types:

  • Reinforcement learning - learn to select an action of maximum payoff
  • Supervised learning - given input, predict output. 2 types: regression (continuous values), classification (labels)
  • Unsupervised learning - discover internal representation of an input (also includes self-supervised learning)

Artificial neuron representation:

  1. x - input vector (“synapses”): the given features
  2. w - weight vector, which regulates the importance of each input
  3. b - bias, which shifts the weighted sum
  4. z - net input w·x + b, a linear combination of the inputs (a vector when computed for a whole layer or batch)
  5. g() - activation function, through which the net input is passed to introduce non-linearity
  6. a - activation g(z), which is the neuron's output
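
The six pieces above can be sketched in a few lines of plain Python. This is only an illustration for a single neuron and a single input sample; the function names `relu` and `neuron` and the example numbers are assumptions, not part of the notes.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


def neuron(x, w, b, g=relu):
    """Return a = g(z), where z = w·x + b is the net input."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear combination of inputs
    return g(z)                                   # non-linearity


# Illustrative values: z = 0.5*1.0 + 0.25*2.0 - 0.5 = 0.5, so a = 0.5
a = neuron(x=[1.0, 2.0], w=[0.5, 0.25], b=-0.5)
print(a)
```

With a negative net input (for example w = [0.5, -1.0], b = 0.25) the same neuron outputs 0, which is the non-linearity ReLU introduces.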

Artificial neural network representation:

  1. Each neuron receives inputs from input neurons and sends activations to output neurons
  2. There are multiple neuron layers and the more there are, the more powerful the network is (usually)
  3. The weights learn to adapt to the required task to produce the best results based on some loss function
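
A minimal sketch of how layers chain together, assuming a fully connected network stored as a list of (weights, biases, activation) triples. The helper names `layer`, `forward`, `identity` and all numeric values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def layer(x, W, b, g):
    """One layer: each row of W holds the weights of one neuron."""
    return [g(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def forward(x, params):
    """Pass x through every layer; each layer's activations feed the next."""
    a = x
    for W, b, g in params:
        a = layer(a, W, b, g)
    return a


params = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),  # hidden layer, 2 neurons
    ([[1.0, 2.0]], [0.5], identity),                # output layer, 1 neuron
]

# Hidden activations are [1.0, 1.5]; output is 1.0*1.0 + 2.0*1.5 + 0.5 = 4.5
y_hat = forward([2.0, 1.0], params)
print(y_hat)
```

Training would then adjust the numbers in `params` to reduce a loss computed on `y_hat`.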

Popular activation functions - ReLU, sigmoid and softmax, the last two of which are mainly used in the last layer before the error function:

$$\mathrm{ReLU}(x) = \max(0, x) \qquad \sigma(x) = \frac{1}{1 + \exp(-x)} \qquad \mathrm{softmax}(x)_i = \frac{\exp(x_i)}{\sum_{j} \exp(x_j)}$$
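
The three activations can be written directly from their definitions. The sketch below uses only the standard library; subtracting the maximum inside `softmax` is a common numerical-stability trick added here as an assumption, and it does not change the result.

```python
import math


def relu(x):
    """ReLU(x) = max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """sigma(x) = 1 / (1 + exp(-x)); maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """softmax(x)_i = exp(x_i) / sum_j exp(x_j); outputs sum to 1."""
    m = max(xs)                           # shift for stability (assumption)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-3.0), sigmoid(0.0), softmax([0.0, 0.0]))
```

Sigmoid suits a single binary output, while softmax turns a vector of scores into a probability distribution over classes.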

Popular error functions - MSE (for regression), Cross Entropy (for classification):

$$\mathrm{MSE}(\hat{y}, y) = \frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2 \qquad \mathrm{CE}(\hat{y}, y) = -\sum_{n=1}^{N} y_n \log \hat{y}_n$$
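
Both error functions translate almost literally into code. In this sketch `y_hat` is the prediction and `y` the target; the `eps` clamp in `cross_entropy` is an added assumption that avoids log(0) and is not part of the formula itself.

```python
import math


def mse(y_hat, y):
    """Mean squared error: average of (y_n - y_hat_n)^2."""
    n = len(y)
    return sum((yn - yhn) ** 2 for yn, yhn in zip(y, y_hat)) / n


def cross_entropy(y_hat, y, eps=1e-12):
    """Cross entropy: -sum_n y_n * log(y_hat_n)."""
    return -sum(yn * math.log(max(yhn, eps)) for yn, yhn in zip(y, y_hat))


# Regression example: errors are 0 and 2, so MSE = (0 + 4) / 2 = 2.0
print(mse([1.0, 2.0], [1.0, 4.0]))

# Classification example: one-hot target, uniform prediction gives log(2)
print(cross_entropy([0.5, 0.5], [1.0, 0.0]))
```

With a one-hot target, cross entropy reduces to the negative log of the probability assigned to the correct class.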

Backpropagation - algorithm that computes the gradient of the error function with respect to the parameters, $\nabla_w E$, by applying the chain rule backward through the layers. The gradient gives the update direction: gradient descent moves the weights w against it, scaled by a learning rate α, so that the error decreases iteratively.

$$w \leftarrow w - \alpha \nabla_w E(w)$$
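
The update rule can be illustrated on the smallest possible model: a single weight w in y_hat = w·x, trained with MSE. The gradient here is derived by hand, since one parameter needs no chain rule through layers; in a multi-layer network backpropagation would supply it. The data, the learning rate and the names `grad_mse` and `gradient_descent` are assumptions made for this sketch.

```python
def grad_mse(w, xs, ys):
    """dE/dw for E(w) = (1/N) * sum_n (w*x_n - y_n)^2."""
    n = len(xs)
    return (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))


def gradient_descent(w, xs, ys, alpha=0.05, steps=100):
    """Repeat the update w <- w - alpha * dE/dw a fixed number of times."""
    for _ in range(steps):
        w = w - alpha * grad_mse(w, xs, ys)
    return w


# Toy data generated by y = 2x, so the error is minimized at w = 2
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w = gradient_descent(0.0, xs, ys)
print(w)
```

If α is too large the updates overshoot and diverge; if it is too small, convergence is slow.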