Assignment 1


1. K-Nearest Neighbours

This algorithm classifies an object based on its nearest neighbours: among the k nearest neighbours, the most common class is found and assigned to the object. The scheme rests on a very general intuition, i.e. its algorithm appears obvious even to a layperson.
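The voting rule described above can be sketched in a few lines. This is a minimal illustration, not the code used for the assignment: it assumes Euclidean distance and breaks ties in favour of the label seen first among the nearest neighbours, and the function name `knn_predict` and the toy data are made up for the example.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D data (hypothetical): two points of class "a", two of class "b".
train_points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_labels = ["a", "a", "b", "b"]
prediction = knn_predict(train_points, train_labels, (0.2, 0.1), k=3)
```

Here the three nearest neighbours of the query are two "a" points and one "b" point, so the vote goes to "a".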





Observations
As the value of k increases, the classification error generally decreases at first and then increases, apart from a few exceptions.
With k varied from 1 to 50 on a dataset of 60000 MNIST images, the best value of k is observed to be 3.
The error at k=3 is observed to be 2.83 percent.
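A sweep over k of the kind described above might look like the sketch below. It assumes the error is measured on a held-out set (the report does not say how the data was split), and the helper names `error_rate` and `best_k` as well as the tiny 1-D dataset are hypothetical. The toy data contains one deliberately mislabelled point so that k=1 does worse than k=3.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k):
    """Majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def error_rate(train_x, train_y, test_x, test_y, k):
    """Percentage of held-out points that are misclassified for a given k."""
    wrong = sum(
        knn_predict(train_x, train_y, x, k) != y for x, y in zip(test_x, test_y)
    )
    return 100.0 * wrong / len(test_x)


def best_k(train_x, train_y, test_x, test_y, k_values):
    """Return the k with the lowest error (smallest k on ties) and all errors."""
    errors = {k: error_rate(train_x, train_y, test_x, test_y, k) for k in k_values}
    return min(errors, key=errors.get), errors


# Toy 1-D data (hypothetical); the point at 0.1 is a mislabelled outlier.
train_x = [(0.0,), (0.2,), (0.4,), (0.1,), (1.0,), (1.2,), (1.4,)]
train_y = ["a", "a", "a", "b", "b", "b", "b"]
test_x = [(0.12,), (1.3,)]
test_y = ["a", "b"]

k_star, errors = best_k(train_x, train_y, test_x, test_y, [1, 3, 5])
```

With k=1 the outlier decides the first test point and the error is high; with k=3 the vote overrides it. On a dataset the size of MNIST one would sort the neighbours once per test point and reuse the ranking for every k, rather than re-sorting as this sketch does.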



2. ISOMAP


Brief
Isomap first finds the distance between input points along a curve or manifold, i.e. the geodesic distance. This is approximated by the shortest-path distances on the k-nearest-neighbour graph of the data.
MDS (multidimensional scaling) is then used to find points in a lower-dimensional Euclidean space whose pairwise distances match the geodesic distances found earlier.
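The first step, approximating geodesic distances by shortest paths on the k-nearest-neighbour graph, can be sketched as follows. This is an illustrative sketch only: it assumes Euclidean edge weights, uses Floyd-Warshall for the shortest paths (practical only for small inputs), and leaves out the MDS step entirely. The function name `geodesic_distances` and the semicircle data are made up for the example.

```python
import math


def geodesic_distances(points, k):
    """Approximate geodesic distances via shortest paths on a k-NN graph."""
    n = len(points)
    # Start with no edges; the distance from a point to itself is zero.
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        # Indices of the other points, sorted by distance from point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            # Add the edge in both directions so the graph is undirected.
            dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    # Floyd-Warshall: relax every pair through every intermediate vertex.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                through = dist[i][m] + dist[m][j]
                if through < dist[i][j]:
                    dist[i][j] = through
    return dist


# 21 points on a unit semicircle, a simple 1-D manifold embedded in 2-D.
points = [(math.cos(t * math.pi / 20), math.sin(t * math.pi / 20)) for t in range(21)]
geo = geodesic_distances(points, k=2)
end_to_end_geodesic = geo[0][-1]                        # follows the curve
end_to_end_euclidean = math.dist(points[0], points[-1])  # cuts straight across
```

For the two ends of the semicircle the graph distance comes out close to the arc length pi, while the straight-line distance is 2. That gap is what Isomap tries to preserve when it hands the distance matrix to MDS.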

Figure: Euclidean distance, digits 1 and 7
Figure: Tangent distance, digits 1 and 7
Figure: Euclidean distance, digits 4 and 9
Figure: Tangent distance, digits 4 and 9
Figure: Euclidean distance, all digits
Figure: Tangent distance, all digits



Key Observation from Isomap

Tangent distance is a better metric than Euclidean distance for this data: the clusters are denser in the tangent-distance case, as is evident from the above figures.

3. Deep Learning
Deep learning is based on learning several levels of representation, and the training time of a model depends on various parameters. The ones I have considered are: learning rate, number of epochs, batch size, DBN size and alpha. The highest accuracy I obtained corresponded to an error of 2.05 percent. Here is a link to my observation table for deep learning.



Observation

  • Increasing the number of epochs decreases the error of the DBN.
  • There is no direct relation between alpha and the performance of the network.
  • Increasing the number of layers in the network may not always be helpful.
  • Learning rate and batch size appear to be optimal at around 5 and 50 respectively.
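Observations like these come from trying many parameter combinations and recording the error for each. One way to organise such a sweep is sketched below. It is not the procedure used for this assignment: `grid_search` is a hypothetical helper, and `toy_error` is an arbitrary stand-in for training a DBN and measuring its error, so the numbers it produces mean nothing.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Evaluate every parameter combination; return the best and all results."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # evaluate is expected to return an error value (lower is better).
        results.append((params, evaluate(**params)))
    best_params, best_error = min(results, key=lambda item: item[1])
    return best_params, best_error, results


def toy_error(learning_rate, batch_size):
    """Arbitrary stand-in for 'train a network and return its error'."""
    return (learning_rate - 1.0) ** 2 + (batch_size - 20) ** 2 / 100


grid = {"learning_rate": [0.1, 1.0, 5.0], "batch_size": [10, 20, 50]}
best_params, best_error, results = grid_search(toy_error, grid)
```

The `results` list plays the role of the observation table: one row per combination, from which trends for a single parameter can be read off by holding the others fixed.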