CS 433: Parallel Programming

Units: 3-0-0-9
Pre-requisites: ESC101, CS210, CS220, CS330, CS335.

Course Contents:

  1. Introduction: Why parallel computing; Ubiquity of parallel hardware/multi-cores; Processes and threads; Programming models: shared memory and message passing; Speedup and efciency; Amdahls Law.

  2. Introduction to parallel hardware: Multi-cores and multiprocessors; shared memory and message passing architectures; cache hierarchy and coherence; sequential consistency.

  3. Introduction to parallel software: Steps involved in developing a parallel program; Depen- dence analysis; Domain decomposition; Task assignment: static and dynamic; Performance issues: 4C cache misses, inherent and artifactual communication, false sharing, computation-to-communication ratio as a guiding metric for decomposition, hot spots and staggered communication.

  4. Shared memory parallel programming: Synchronization: Locks and barriers; Hardware primitives for efcient lock implementation; Lock al- gorithms; Relaxed consistency models; High-level language memory models (such Java and/or C++); Memory fences. Developing parallel programs with UNIX fork model: IPC with shared memory and message pass- ing; UNIX semaphore and its all-or-none semantic. Example case studies (see note below for some details). Developing parallel programs with POSIX thread library: Thread creation; Thread join; Mutex; Condition variables. Example case studies (see note below for some details). Developing parallel programs with OpenMP directives: Parallel for; Parallel section; Static, dy- namic, guided, and runtime scheduling; Critical sections and atomic operations; Barriers; Reduction. Example case studies (see note below for some details).

  5. Message passing programming: Distributed memory model; Introduction to message passing interface (MPI); Synchronization as Send/Recv pair; Synchronous and asynchronous Send/Recv; Collective communication: Reduce, Broadcast, Data distribution, Scatter, Gather; MPI derived data types. Example case studies (see note below for some details).

  6. Introduction to GPU programming: GPU architecture; Introduction to CUDA programming; Concept of SIMD and SIMT computation; Thread blocks; Warps; Global memory; Shared memory; Thread divergence in control transfer; Example case studies (see note below for some details).

  7. Additional topics: PGAS and APGAS programming paradigms; Transactional memory paradigm; Introduction to speculative parallelization.

** Notes **:

  1. The example case studies should be chosen to cover a wide variety of parallel algorithms drawn from nu- meric as well as non-numeric domains. Possibilities include parallel sort, parallel prex, parallel search, graph algorithms, parallel ranking, reduction, algorithms using tree, fan, pipe paradigms, matrix computa- tion, equation solvers, n-body simulation, ray tracing, etc.
  2. The instruction must accompany an adequate number of programming assignments demonstrating the concepts.
  3. The instructors are encouraged to offer large semester-long programming projects.

Books and References:

  1. Peter S Pacheco, An Introduction to Parallel Programming, Morgan Kaufmann ,2011.
  2. M Herlihy and N Shavit The Art of Multiprocessor Programming Morgan Kaufmann, 2008.
  3. JL Hennessy and DA Patterson, Computer Architecture: A Quantitative Approach,4th Ed., Morgan Kaufmann/Els India, 2006.
  4. DE Culler and JP Singh with A Gupta, Parallel Computer Architecture: A Hardware/Software Approach Morgan-Kaufmann, 1998.
  5. A Grama, A Gupta, G Karypis, and V Kumar, Introduction to Parallel Computing. 2nd Ed., Addison-Wesley, 2003.
  6. MJ Quinn, Parallel Computing: Theory and Practice, Tata McGraw Hill, 2002.
  7. DB Kirk and W-m W Hwu. Programming Massively Parallel Processors, Morgan Kaufmann, 2010.