COMPUTER SCIENCE AND ENGINEERING DEPARTMENT

IIT Kanpur

 

CS 973: Machine Learning for Cyber Security 

 

 

Instructor: 

Dr. Sandeep K. Shukla
Computer Science and Engineering Department

 

 

Major, Measurable Learning Objectives

 

Having successfully completed this course, the student will be able to:

 

  1. Articulate and explain which problems in Cyber Security may be solvable with Machine Learning
  2. Understand and implement machine learning algorithms and models for Cyber Security problems such as malware analysis, intrusion detection, spam filtering, fraud detection, and online behavior analysis
  3. Gain basic hands-on experience with supervised and unsupervised learning methods
  4. Understand the basic theory of classification and regression techniques
  5. Understand feature extraction from data
  6. Develop tools for cyber defense using machine learning

 

 

  • Prerequisites and Co-requisites

 

Students must have completed the course Introduction to Linear Algebra and have basic familiarity with probability theory.

 

  • Texts and Special Teaching Aids

 

There is no specific textbook. Course notes, lecture notes, and projects will be available on the course website.

 

  • Syllabus

                                                                                                

The following syllabus is tentative. Some topics may be excluded and others included, depending on the progress of the course.

 

  1. Data processing, cleaning, visualization, and exploratory analysis
  2. Data set collection and feature extraction
  3. Cyber Security problems that can be solved using Machine Learning
  4. Malware Analysis, Intrusion Detection, Spam detection, Phishing detection, Financial Fraud detection, Denial of Service Detection
  5. Basic Probability theory and Distributions
  6. Estimation Theory, Hypothesis Testing
  7. Linear Regression (uni- and multi-variate) and Logistic Regression
  8. Basic Classification Techniques
  9. Unsupervised Learning
  10. Supervised Learning
  11. Spectral Embedding, Manifold detection and Anomaly Detection
  12. Decision Trees
  13. Ensemble learning
  14. Random Forest 

 

Lecture Plan

Module | Topic | No. of Hours
--- | --- | ---
Introduction | Cyber Security Problems and Machine Learning Based Solutions | 1
Basic Probability and Distributions | Binomial, Poisson, Normal, Exponential, other distributions | 1
Estimation Theory and Hypothesis Testing | Sampling, Estimation, Hypothesis testing | 2
Regression | Uni- and multi-variate regression, logistic regression | 2
Supervised Learning | Linear Classifiers, Decision Trees, Ensemble Learning, Random Forest | 3
Unsupervised Learning | Clustering, Manifold Discovery, Diffusion map, spectral embedding, Anomaly detection, outliers | 5
Malware Analysis | Static and Dynamic Analysis, Models that work well | 2
Spam/Phishing Detection | Training Models and Measuring Efficacy | 1
Intrusion Detection | Network Intrusion Detection | 1
Fraud Detection | Machine Learning Models for Outlier detection | 1
DDoS Detection | Models with Statistical regression combined with distance metric | 1
Total | | 20