In this project we propose a novel building detection method for the terrain of IIT Kanpur. Training data is collected by segmenting aerial videos of IIT Kanpur buildings, recorded with the front camera of an AR Drone, into images. We first extract scale-invariant features from the training images using the SIFT algorithm. A bag of visual words model is then applied: the descriptors are clustered using the K-means algorithm to form a visual vocabulary, and each image is represented in terms of these visual words. Finally, a one-vs-one binary SVM classification algorithm is used to learn the classifier. We conclude by reporting the classification results obtained with the proposed method.
Image Source: http://cdn.arstechnica.net
Video Segmentation : The video is first segmented into images at a frame rate of 0.5 frames per second (i.e. one image per 2 seconds of clip) using the ffmpeg library.
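The extraction step can be sketched as follows. This is a minimal sketch, not the project's actual script: the helper names `build_ffmpeg_command` and `extract_frames` and the paths `clip.mp4` and `frames/frame_%04d.png` are hypothetical, and it assumes the `ffmpeg` command-line tool is installed and the output directory already exists. It relies on ffmpeg's `fps` video filter to keep one frame every 2 seconds.

```python
import subprocess


def build_ffmpeg_command(video_path, output_pattern, fps=0.5):
    """Build an ffmpeg command that samples `fps` frames per second.

    fps=0.5 corresponds to one image per 2 seconds of clip.
    """
    return [
        "ffmpeg",
        "-i", video_path,        # input video
        "-vf", f"fps={fps}",     # fps filter: resample to the given frame rate
        output_pattern,          # numbered output images, e.g. frame_0001.png
    ]


def extract_frames(video_path, output_pattern, fps=0.5):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_ffmpeg_command(video_path, output_pattern, fps), check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only; this only prints the command.
    print(" ".join(build_ffmpeg_command("clip.mp4", "frames/frame_%04d.png")))
```

Keeping the command construction separate from the call to `subprocess.run` makes it possible to inspect the command without having ffmpeg installed.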
Feature Extraction : The Scale Invariant Feature Transform (SIFT) computes 'key points' and their 'descriptors'. We employed dense SIFT, which computes descriptors on a regular grid, rather than the keypoint-based SIFT described by Lowe. For category recognition, dense SIFT features have been found to give better results than keypoint-based SIFT features. [1][3][4]
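Dense SIFT differs from Lowe's detector mainly in where descriptors are computed: on a regular grid rather than at detected key points. The sketch below illustrates only that sampling grid; it does not compute SIFT descriptors. The function name `dense_grid` and the `step` and `patch_size` values are assumptions for illustration, not the settings used in the project.

```python
def dense_grid(width, height, step=8, patch_size=16):
    """Return (x, y) patch centers on a regular grid.

    Centers are placed so that a patch_size x patch_size patch
    around each one lies fully inside the image.
    """
    half = patch_size // 2
    xs = range(half, width - half + 1, step)
    ys = range(half, height - half + 1, step)
    return [(x, y) for y in ys for x in xs]


if __name__ == "__main__":
    centers = dense_grid(64, 48)
    print(len(centers), centers[:3])
```

A descriptor would then be computed for the patch around every center, so the number of descriptors per image depends only on the image size and the grid spacing.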
Bag of Visual Words : The next step is to quantize the descriptors into a visual vocabulary. The K-means clustering algorithm is employed, and each cluster center is treated as a visual word, giving a dictionary of k visual words. Each image is then represented by a histogram of visual-word occurrences. These histograms are used for classification.[4]
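The quantization step can be illustrated with a small sketch. It assumes the vocabulary (the K-means cluster centers) has already been computed, and it uses toy 2-D descriptors instead of real SIFT descriptors; the function names are hypothetical and chosen for this example only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary center closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: squared_distance(descriptor, vocabulary[k]))


def bovw_histogram(descriptors, vocabulary):
    """L1-normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)


if __name__ == "__main__":
    vocabulary = [(0, 0), (10, 10)]                  # two toy visual words
    descriptors = [(1, 0), (0, 1), (1, 1), (9, 9)]   # toy descriptors of one image
    print(bovw_histogram(descriptors, vocabulary))
```

Normalizing the histogram makes images with different numbers of descriptors comparable; whether the project normalizes in this way is an assumption.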
Multi-Class SVM : The classifier learns the patterns and regularities in the input vectors. One-vs-one classification is chosen for speed, since each binary SVM is trained on the samples of only two classes. After successful training, the classifier is able to assign a test sample to the correct class with high accuracy.[3][4]
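The one-vs-one scheme can be sketched as a majority vote over all class pairs; with k classes there are k(k-1)/2 binary classifiers, i.e. 3 for the three buildings considered here. In this sketch the callable `pairwise_decide` is a hypothetical stand-in for a trained binary SVM, and the toy 1-D prototypes are invented for illustration only.

```python
from itertools import combinations


def one_vs_one_predict(sample, classes, pairwise_decide):
    """Predict a class by majority vote over all class pairs.

    pairwise_decide(sample, a, b) must return either a or b; it stands in
    for a trained binary SVM separating class a from class b.
    """
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(sample, a, b)] += 1
    # Ties are broken in favor of the class listed first.
    return max(classes, key=lambda c: votes[c])


if __name__ == "__main__":
    # Toy stand-in: each class has a 1-D prototype, nearest prototype wins.
    prototypes = {"CSE": 0.0, "Library": 5.0, "IME": 10.0}

    def decide(x, a, b):
        return a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b

    print(one_vs_one_predict(4.0, list(prototypes), decide))
```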
Given a test image, its test score is calculated, from which a probability is assigned to each class; the class with the maximum probability is selected. The method proposed in [2] is used to assign a probability to each class for a given test score.
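As an illustration of turning pairwise outputs into class probabilities, the sketch below assumes the coupling rule commonly attributed to [2]: P_i = 1 / (sum over j != i of 1/r_ij - (k - 2)), where r_ij is the estimated probability of class i given that the sample belongs to class i or class j. How the project maps SVM scores to the pairwise values r_ij is not described here, so `r` is taken as given, and the function name `couple_pairwise` is hypothetical.

```python
def couple_pairwise(r):
    """Combine pairwise probabilities into per-class probabilities.

    r[i][j] is the estimated probability of class i given that the sample
    belongs to class i or class j (so r[i][j] + r[j][i] == 1, and every
    off-diagonal entry must be > 0). Diagonal entries are ignored.
    """
    k = len(r)
    raw = []
    for i in range(k):
        s = sum(1.0 / r[i][j] for j in range(k) if j != i)
        raw.append(1.0 / (s - (k - 2)))
    total = sum(raw)
    return [p / total for p in raw]  # renormalize so the result sums to 1


if __name__ == "__main__":
    # Pairwise values built from known class probabilities, for illustration.
    p = [0.5, 0.3, 0.2]
    r = [[p[i] / (p[i] + p[j]) for j in range(3)] for i in range(3)]
    probs = couple_pairwise(r)
    print(probs, "-> predicted class index:", probs.index(max(probs)))
```

When the pairwise values are mutually consistent, as in this toy input, the rule recovers the underlying class probabilities; the predicted class is the one with the largest probability.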
We experiment on videos of the H. R. Kadim Department of Computer Science and Engineering (CSE Department), the P. K. Kelkar Library and the Department of Industrial and Management Engineering (IME). The following images show the scores we obtained on successful classification.
We were able to classify the various buildings, with reasonable success in detecting their sides. For more information, refer to the project report.
REPORT | CODE | DATASET (Video)
PROPOSAL | POSTER | DATASET (Images)
PRESENTATION
NOTE: The dataset comprises aerial video and imagery of buildings at IIT Kanpur. We plan to expand the dataset soon.
[1] David G. Lowe. "Distinctive Image Features from Scale-Invariant Keypoints", 2004.
[2] D. Price, S. Knerr, L. Personnaz, and G. Dreyfus. "Pairwise neural network classifiers with probabilistic outputs". In G. Tesauro, D. Touretzky, and T. Leen (eds.), Advances in Neural Information Processing Systems 7, The MIT Press, 1995.
[3] Thomas Moranduzzo and Farid Melgani. "A SIFT-SVM method for detecting cars in UAV images", 2012 IEEE International Geoscience and Remote Sensing Symposium, 2012.
[4] Andrea Vedaldi and Andrew Zisserman. "Image Classification Practical", 2011.