Units: 3-0-0-0 (9)
Pre-requisites: CS220
This course covers the principles and practices of designing contemporary multi-core and multiprocessor architectures. It introduces students to broad topics such as cache coherence, memory consistency models, synchronization primitives, on-chip interconnection networks, and performance pathologies of shared memory parallel programs.
Sl. No. | Broad Title | Topics | No. of Lectures (75 minutes each) |
1. | Introduction | Multi-cores: why and what; Moore’s law; Dennard scaling | 2 |
2. | Fundamentals of memory system | Virtual memory; address translation hardware; SRAM and caches; DRAM and main memory | 6 |
3. | Tools and techniques for evaluating architectures | Simulation; dynamic binary instrumentation; performance counters; use of special instructions such as cpuid on x86 | 3 |
4. | Introduction to shared memory multiprocessors and multi-cores | Types of architectures; problem of cache coherence; specification of cache coherence protocols as a set of invariants; basics of memory consistency models | 5 |
5. | Shared memory synchronization | Hardware support for efficient synchronization; interplay of cache coherence, speculative execution, and synchronization primitives; implementation of efficient locks and barriers | 3 |
6. | Performance analysis of shared memory parallel programs | Brief introduction to shared memory parallel programming techniques: POSIX thread model, OpenMP, fork/mmap; performance pathologies of shared memory parallel programs; influence of cache coherence and synchronization | 3 |
7. | Scalable cache coherence | Directory-based coherence protocols and their implementation; case study of SGI Origin 2000 protocol | 3 |
8. | Memory consistency models | Sequential consistency, total store order, partial store order, processor consistency, weak ordering, release consistency | 2 |
9. | Interconnection networks | Topologies; integrated router design; routing techniques for networks on chip; interplay of deadlock-free routing and cache coherence; virtual channels | 2 |