CS 423: Multi-core and Multiprocessor Architecture

Units: 3-0-0-0 (9)

 

Pre-requisites: CS220

 

Objectives:

The primary objective of the course is to discuss the principles and practices of designing contemporary multi-core and multiprocessor architectures.

 

Summary:

This course studies the principles and practices of multi-core and multiprocessor design. It introduces students to broad topics such as cache coherence, memory consistency models, synchronization primitives, on-chip interconnection networks, and performance pathologies of shared-memory parallel programs.

 

Contents:
Sl. No. | Broad Title | Topics | No. of Lectures (75 minutes each)
1 | Introduction | Multi-cores: why and what; Moore’s law; Dennard scaling | 2
2 | Fundamentals of memory system | Virtual memory; address translation hardware; SRAM and caches; DRAM and main memory | 6
3 | Tools and techniques for evaluating architectures | Simulation; dynamic binary instrumentation; performance counters; use of special instructions such as the x86 cpuid instruction | 3
4 | Introduction to shared memory multiprocessors and multi-cores | Types of architectures; problem of cache coherence; specification of cache coherence protocols as a set of invariants; basics of memory consistency models | 5
5 | Shared memory synchronization | Hardware support for efficient synchronization; interplay of cache coherence, speculative execution, and synchronization primitives; implementation of efficient locks and barriers | 3
6 | Performance analysis of shared memory parallel programs | Brief introduction to shared memory parallel programming techniques: POSIX thread model, OpenMP, fork/mmap; performance pathologies of shared memory parallel programs; influence of cache coherence and synchronization | 3
7 | Scalable cache coherence | Directory-based coherence protocols and their implementation; case study of SGI Origin 2000 protocol | 3
8 | Memory consistency models | Sequential consistency; total store order; partial store order; processor consistency; weak ordering; release consistency | 2
9 | Interconnection networks | Topologies; integrated router design; routing techniques for networks on chip; interplay of deadlock-free routing and cache coherence; virtual channels | 2
 
References/Books:
  1.  D. E. Culler and J. P. Singh with A. Gupta. Parallel Computer Architecture: A Hardware/Software Approach. Morgan Kaufmann.
  2.  J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann/Elsevier-India.