CS 685: Data Mining


Prerequisites:

Bachelor-level courses in databases, algorithms, and statistics, or consent of the instructor.

Course Description:

We are witnessing unprecedented growth in the amount of data, ranging from protein sequences and structures to biomedical images, sensor readings, and chemical data. For this vast amount of data to be more useful than a mere digital archive, we need the ability to mine the knowledge inherent in the collection. This course covers the standard algorithms for data mining. Special emphasis will be placed on recent trends in mining text data, graphs, and spatio-temporal data.

In addition to the instructor's lectures, students will be asked to coordinate an in-class discussion of a recent paper. They will also be required to complete a group project that gives them hands-on experience with the techniques taught in the class.

Course Content:

  1. Knowledge mining from databases
  2. Data pre-processing
  3. Multi-dimensional data modeling
  4. Classification and prediction
  5. Clustering
  6. Frequent itemset mining
  7. Anomaly detection
  8. Mining special kinds of data, including text and graphs

Books and References:

Most of the material will be drawn from journal articles and conference proceedings available online. Reference books include "Data Mining: Concepts and Techniques" by Jiawei Han and Micheline Kamber, Elsevier, 2006.