Lecture | MWF: 10:00 AM - 10:50 AM (ENGR 301A / Online) |
Lecturer |
Joseph Picone, Professor
Office: ENGR 718
Office Hours: (MWF) 08:00 AM - 10:00 AM, other times by appointment
Phone: 215-204-4841 (desk); 215-954-7076 (cell - preferred)
Email: picone@temple.edu
Zoom: picone@temple.edu or joseph.picone@gmail.com |
Google Group URL:
https://groups.google.com/forum/#!forum/temple_engineering_ece4822
Google Group Email: temple-engineering-ece4822@googlegroups.com |
|
Website | http://www.isip.piconepress.com/courses/temple/ece_4822 |
Textbook / Online Resources |
As with many computer-related disciplines, a vast number of
textbooks and web sites are devoted to this topic.
Some of the resources we will use in this
course are:
|
Reference Textbooks |
This is one of the better books on parallel processing:
D.B. Kirk and W.W. Hwu
Programming Massively Parallel Processors: A Hands-On Approach
Morgan Kaufmann; Third Edition (December 21, 2016), 576 pages
ISBN: 978-0128119860
URL: Morgan Kaufmann; Amazon Link
This book presents an excellent programmer's view of GPU programming:
J. Han and B. Sharma
Learn CUDA Programming: A beginner's guide to GPU programming and parallel computing with CUDA 10.x and C/C++
Packt Publishing; First Edition (September 27, 2019), 508 pages
ISBN: 978-1788996242
URL: Packt Publishing Link; Amazon Link |
Other Resources |
Internet-based resources play a major role in this course. We will
make extensive use of the Linux operating system, but will only have
time to scratch the surface of this topic. Excellent in-depth
training resources can be found here:
Free Linux Online Training:
a wide range of Linux tutorials is available.
LearnPython.org: many excellent interactive tutorials.
Learning how to use the Internet to solve problems is another very
important skill you will learn in this course. We often describe
this as "learning how to learn." An amazing resource that contains
an answer to just about any computer question you can imagine is:
Stack Overflow:
where you can find answers to almost any programming question.
|
Prerequisites | Minimum grade of C- in (ECE 3822 or CIS 2168) |
|
|
Programming Assignments (10) | 75% |
Final Exam | 25% |
TOTAL: | 100% |
|
|
|
|
|
|
General Computing: A Brief Introduction to Linux and Linux Clusters | slides | aux | code | lecture |
|
|
General Computing: A Brief Introduction to Supercomputing | slides | aux | code | lecture |
|
|
C/C++ Programming: Overview of HW #1 | slides | aux | code | lecture |
|
|
General Computing: A Brief History of Coprocessors | slides | aux | code | lecture |
|
|
C/C++ Programming: Pointers, Array Indexing and Compiler Optimizations | slides | aux | code | lecture |
|
|
Basic GPU Programming: The GPU Economy | slides | aux | code | lecture |
|
|
Basic GPU Programming: GPU Architectures | slides | aux | code | lecture |
|
|
DSP Review: Digital Filters and Factoring Computations | slides | aux | code | lecture |
|
|
DSP Review: Real-Time Systems | slides | aux | code | lecture |
|
|
DSP Review: Linear System Theory | slides | aux | code | lecture |
|
|
Basic GPU Programming: Host and Coprocessor Dynamics | slides | aux | code | lecture |
|
|
Basic GPU Programming: Blocks and Threads | slides | aux | code | lecture |
|
|
Basic GPU Programming: Cooperating Threads | slides | aux | code | lecture |
|
|
Basic GPU Programming: GPU Memory | slides | aux | code | lecture |
|
|
Basic CPU Programming: Virtual Memory | slides | aux | code | lecture |
|
|
Basic CPU Programming: Cache, Multithreading and Hyperthreading | slides | aux | code | lecture |
|
|
Basic GPU Programming: Using Multiple GPUs (Manual) | slides | aux | code | lecture |
|
|
Parallel Computing: Foundations | slides | aux | code | lecture |
|
|
Parallel Computing: Basic Parallel Programming Using OpenMP | slides | aux | code | lecture |
|
|
Parallel Computing: Shared Memory and Multithreading in OpenMP | slides | aux | code | lecture |
|
|
Python Programming: Interfaces to CUDA | slides | aux | code | lecture |
|
|
Python Programming: Creating Efficient Functions in Numba | slides | aux | code | lecture |
|
|
Parallel Computing: Alternatives to CUDA | slides | aux | code | lecture |
|
|
Advanced GPU Programming: Streams and Concurrency | slides | aux | code | lecture |
|
|
Advanced GPU Programming: CUDA BLAS | slides | aux | code | lecture |
|
|
Parallel Computing: Message Passing | slides | aux | code | lecture |
|
|
Parallel Computing: MapReduce and Data Parallelism | slides | aux | code | lecture |
|
|
General Computing: Error Correction Coding | slides | aux | code | lecture |
|
|
General Computing: Disks, File Systems and I/O | slides | aux | code | lecture |
|
|
Parallel Computing: Hadoop - A Distributed File System | slides | aux | code | lecture |
|
|
Final Exam: Digital Resampling of Signals | slides | aux | code | lecture |
|
|
Advanced GPU Programming: The Thrust C++ Library | slides | aux | code | lecture |
|
|
General Computing: Task Scheduling | slides | aux | code | lecture |
|
|
Applications: Introduction to Computer Graphics | slides | aux | code | lecture |
|
|
Applications: Ray Tracing in Computer Graphics | slides | aux | code | lecture |
|
|
Applications: Ray Tracing on a GPU | slides | aux | code | lecture |
|
|
Final Exam: A Deep Dive Into Image Processing | slides | aux | code | lecture |
|
|
Basic GPU Programming: The NVIDIA Blackwell Architecture | slides | aux | code | lecture |
|
|
Applications: Introduction to Neural Networks and Deep Learning | slides | aux | code | lecture |
|
|
Applications: Implementation of Deep Learning Systems Using PyTorch | slides | aux | code | lecture |
|
|
Applications: Introduction to Quantum Computing | slides | aux | code | lecture |
|
|
Final Exam (8:00 AM - 10:00 AM): Competition Deadline and Student Presentations | slides | aux | code | lecture |
|
|
|
|
|
Linear Algebra and DSP in C++ |
|
|
Fast Algorithms in C++ |
|
|
Numerical Libraries |
|
|
My First GPU Program |
|
|
Matrix Multiplication on a GPU Using CUDA |
|
|
Parallel DSP Filters Using OpenMP |
|
|
Parallel DSP on a GPU |
|
|
Managing GPU Resources |
|
|
Parallelizing Code Across Multiple GPUs |
|
|
Neural Networks and Dense Graphs |
|
|
No Late Homework Accepted After This Date |
|
|
Final Exam: Data Challenge |