CS365: Assignment-2

submitted by

Shubham Gupta 10699

A. k-NN Based Classifier

Graph of 'percentage error on test set' vs 'k' value (image)
No. of training data: 3000
No. of test data: 1000
It can be observed that the error is minimum for 3 or 4 nearest neighbours. It decreases until around k = 4 and then starts increasing again as k is increased further.
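The experiment behind the graph can be sketched roughly as follows. This is a minimal illustration, not the submitted code: it assumes Euclidean distance and majority voting, uses small synthetic 2-D data in place of the digit images, and the names knn_predict and percent_error are hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k):
    """Label each test point by majority vote among its k nearest training points."""
    # Squared Euclidean distances, shape (n_test, n_train).
    # Note: this broadcast is memory-hungry for large image sets; fine for a sketch.
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training points for every test point.
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]
    # Majority vote; bincount(...).argmax() breaks ties toward the smaller label.
    return np.array([np.bincount(row).argmax() for row in votes])

def percent_error(pred, truth):
    """Percentage of test points that were misclassified."""
    return 100.0 * np.mean(pred != truth)

# Toy stand-in for the digit data (assumption): two Gaussian blobs.
rng = np.random.default_rng(0)
train_x = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
train_y = np.array([0] * 50 + [1] * 50)
test_x = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
test_y = np.array([0] * 20 + [1] * 20)

# Error on the test set for each k, i.e. the quantity plotted against k.
errors = {k: percent_error(knn_predict(train_x, train_y, test_x, k), test_y)
          for k in range(1, 8)}
for k, err in errors.items():
    print(f"k={k}: {err:.1f}% error")
```

Sweeping k in this way and plotting errors against k gives a curve of the kind shown in the graph above.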


B. Manifold based modeling of MNIST digits

Isomap, short for Isometric feature mapping, is an algorithm for manifold learning. It computes a low-dimensional embedding of a set of high-dimensional data points. The algorithm is implemented in the following way:
1. Connect each point to its k nearest neighbours. These connections are represented as a weighted graph whose edge weights are the distances (Euclidean/tangent) between the connected points.
2. Geodesic distances between all pairs of points are estimated using a shortest-path algorithm on this graph.
3. Classical MDS (multidimensional scaling) is used to embed the data in a lower-dimensional space whose interpoint distances match the geodesic distances as closely as possible.
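The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the plots: it uses Euclidean distance only, Floyd-Warshall as the shortest-path algorithm, and a hypothetical function name isomap. A tangent-distance variant would replace the pairwise distance matrix d.

```python
import numpy as np

def isomap(x, n_neighbors, n_components=2):
    """Embed the rows of x with Isomap (Euclidean neighbourhood graph)."""
    n = x.shape[0]
    # Pairwise Euclidean distances between all points.
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2))

    # Step 1: k-NN graph; inf marks "no edge". Column 0 of the argsort is the
    # point itself (assumes no duplicate points), so it is skipped.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    idx = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    graph[rows, cols] = d[rows, cols]
    graph[cols, rows] = d[rows, cols]  # keep the graph symmetric

    # Step 2: geodesic distances via Floyd-Warshall (O(n^3), fine for small n).
    for m in range(n):
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    if not np.isfinite(graph).all():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")

    # Step 3: classical MDS on the geodesic distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (graph ** 2) @ centering
    vals, vecs = np.linalg.eigh(b)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Toy example (assumption): points on a half circle, unrolled to one dimension.
theta = np.linspace(0, np.pi, 30)
arc = np.column_stack([np.cos(theta), np.sin(theta)])
embedding = isomap(arc, n_neighbors=3, n_components=1)
print(embedding.shape)
```

For the digit images, x would hold one flattened image per row and n_components=2 would give coordinates suitable for a scatter plot of the clusters.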

B.1 Isomap using Euclidean distance

a) Clusters for digits 1 and 7 (image)
b) Clusters for digits 4 and 9 (image)
c) Clusters for all the digits (image)

OBSERVATIONS

a. The clusters of 1 and 7 are separated satisfactorily, so Euclidean distance can be used to distinguish between handwritten 1 and 7.
b. The clusters of 4 and 9 cannot be distinguished properly. Hence Euclidean distance is not very effective at distinguishing between handwritten 4 and 9. The reason is that 4 and 9 are written in more or less the same manner.
c. All the digits form clusters in different portions of the graph depending on their features. Depending on how different the digits look, their clusters are either separated or merged.

B.2 Isomap using Tangent distance

a) Clusters for digits 1 and 7 (image)
b) Clusters for digits 4 and 9 (image)
c) Clusters for all the digits (image)

OBSERVATIONS

a. Digits 1 and 7 form well-separated groups in the space. Hence tangent distance can be used effectively to distinguish between handwritten 1 and 7.
b. Digits 4 and 9 also form well-separated groups in the space. Hence 4 and 9 can be distinguished effectively using tangent distance.
c. Different digits form groups in different portions of the space. The relative positions of the clusters depend on the similarities and differences between the digits.
Overall, it can be observed that tangent distance groups the images of handwritten numerals more cleanly than Euclidean distance does. It can therefore be argued that tangent distance is the more effective of the two for recognizing the digits.

C. Deep Learning

Procedure
RBMs (Restricted Boltzmann Machines) can be stacked and trained in a greedy manner to form so-called Deep Belief Networks (DBN). This method learns to extract a deep hierarchical representation of the training data. It uses a greedy, unsupervised, layerwise training method for updating the parameters.
Each layer has a certain number of hidden units (sigmoid functions). Each unit is connected to every unit of the adjacent layer through weighted edges. Training is done layer by layer: the output of the first layer is the input to the second, and so on. Because this pre-training is unsupervised, each layer's parameters are tuned to model its own input rather than the class labels; the labels are used only in the fine-tuning stage.
The DBN is then unfolded into a simple NN with an output layer corresponding to the 10 digits. This NN is trained using the gradient descent algorithm for updating the weights, with the error in the output w.r.t. the desired results driving the updates. The final weights are then tested on the MNIST test data set.
All these processes are carried out using the built-in functions of the Deep Learning toolbox.
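The greedy layerwise pre-training described above can be illustrated with the sketch below. It is not the toolbox code: it assumes one step of contrastive divergence (CD-1) as the RBM update, which the report does not state, and the names train_rbm, pretrain_dbn and dbn_features are hypothetical. The parameter names alpha, batchsize and epochs follow the terms used in this report. Supervised fine-tuning of the unfolded NN is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, alpha=0.1, epochs=10, batchsize=10, seed=0):
    """Train one RBM with CD-1 (assumed update rule); returns weights and biases."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    w = rng.normal(0, 0.01, (n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        order = rng.permutation(len(data))
        for start in range(0, len(data), batchsize):
            v0 = data[order[start:start + batchsize]]
            # Positive phase: hidden probabilities and a binary sample.
            h0_prob = sigmoid(v0 @ w + b_hid)
            h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
            # Negative phase: reconstruct the visibles, then the hiddens.
            v1_prob = sigmoid(h0 @ w.T + b_vis)
            h1_prob = sigmoid(v1_prob @ w + b_hid)
            # Move toward data statistics, away from reconstruction statistics.
            w += alpha * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / len(v0)
            b_vis += alpha * (v0 - v1_prob).mean(axis=0)
            b_hid += alpha * (h0_prob - h1_prob).mean(axis=0)
    return w, b_vis, b_hid

def pretrain_dbn(data, layer_sizes, **rbm_options):
    """Greedy stacking: each RBM is trained on the previous layer's output."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        w, _, b_hid = train_rbm(x, n_hidden, **rbm_options)
        layers.append((w, b_hid))
        x = sigmoid(x @ w + b_hid)  # output of this layer feeds the next one
    return layers

def dbn_features(data, layers):
    """Forward pass through the stacked layers (the unfolded network, no output layer)."""
    x = data
    for w, b_hid in layers:
        x = sigmoid(x @ w + b_hid)
    return x

# Toy stand-in for binarized images (assumption): 40 random 16-pixel patterns.
rng = np.random.default_rng(1)
toy = (rng.random((40, 16)) < 0.5).astype(float)
layers = pretrain_dbn(toy, [8, 4], alpha=0.1, epochs=5, batchsize=10)
features = dbn_features(toy, layers)
print(features.shape)
```

In the full pipeline, the pre-trained weights would initialize a feed-forward network with a 10-unit output layer, which is then fine-tuned on the labelled training data by gradient descent.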


Results (Table)
Observations
1. The percentage error decreases as the learning rate of the NN is increased up to a certain level (around 5 in this case), and then starts increasing again.
2. Increasing the number of epochs reduces the error, but beyond a certain number it has no considerable effect.
3. Decreasing the batch size of the NN improves the accuracy of the NN but increases the runtime.
4. Nothing much could be concluded about the effect of the architecture on the error percentage, but a larger architecture increases the complexity and runtime significantly.
5. Increasing the alpha of the DBN has a negligible effect on the error.
6. The batch size of the DBN was kept fairly constant, as changing it had a negative impact on accuracy.

Code for all the parts can be found here