Lecture | MWF: 10:00 - 10:50 AM (ENGR 308 / Online) |
Lecturer | Joseph Picone, Professor
Office: ENGR 718
Office Hours: MWF, 08:00 - 10:00 AM
Phone: 215-204-4841 (desk); 708-848-2846 (cell, preferred)
Email: picone@temple.edu
Zoom: picone@temple.edu or joseph.picone@gmail.com |
Google Group | URL: https://groups.google.com/g/temple_engineering_ece8527
Email: temple_engineering_ece8527@googlegroups.com |
|
Website | http://www.isip.piconepress.com/courses/temple/ece_8527 |
Required Textbooks |
I. Drori, The Science of Deep Learning, Cambridge University Press, New York, New York, USA, ISBN: 978-1-108-83508-4, 339 pp., 2023. URL: https://www.thescienceofdeeplearning.org/
A. Lindholm, N. Wahlstrom, F. Lindsten and T. Schon, Machine Learning: A First Course for Engineers and Scientists, Cambridge University Press, New York, New York, USA, ISBN: 978-1-108-84360-7, 338 pp., 2022. URL: http://smlbook.org/book/sml-book-draft-latest.pdf |
Reference Textbooks |
There are many textbooks available online. Most modern textbooks focus on neural networks or deep learning. While these are important topics, our goal in this course is to give you a broad perspective of the field. Technology changes quickly, so you need a solid background in the fundamentals, and you need to be able to do more than just "push buttons."
Those wishing to build their theoretical background in this area will find this book useful:
|
Prerequisites | ECE 8527/5110: ENGR 5022 (minimum grade: B-) and ENGR 5033 (minimum grade: B-). ECE 4527: ECE 3512 (minimum grade: C-) and ECE 3522 (minimum grade: C-) |
Pattern recognition theory and practice is concerned with the design, analysis, and development of methods for the classification or description of patterns, objects, signals, and processes. At the heart of this discipline is our ability to infer the statistical behavior of data from limited data sets, and to assign data to classes based on generalized notions of distances in a probabilistic space. Many commercial applications of pattern recognition exist today, including voice recognition (e.g., Amazon Alexa), fingerprint classification (e.g., MacBook Pro touch bar), and retinal scanners (e.g., your favorite cheesy sci-fi movie).

Machine learning is a field that is at least 50 years old. Recent advances in deep learning, starting around 2005, have revolutionized the field. Today, machine learning is one of the most active areas of engineering and is enjoying unprecedented levels of success. However, to understand why these techniques work, we must build a background in traditional pattern recognition and machine learning concepts such as maximum likelihood decision theory, Bayesian estimation, nonparametric methods such as decision trees and support vector machines, and temporal modeling approaches such as hidden Markov models. This course is designed to give you a strong background in these fundamentals, yet also introduce you to the tools necessary to implement these algorithms.
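For a concrete taste of these fundamentals, the sketch below implements a two-class Gaussian maximum likelihood classifier with a Bayes decision rule. It is written in Python with NumPy as an assumed toolchain (the course does not prescribe one here), and the synthetic data is purely illustrative, not part of any assignment.

```python
import numpy as np

# Hypothetical synthetic two-class data: each class drawn from its own Gaussian.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=-1.0, scale=1.0, size=200)   # class 0 samples
x1 = rng.normal(loc=+2.0, scale=1.5, size=200)   # class 1 samples

# Maximum likelihood estimates of each class-conditional Gaussian.
mu0, sd0 = x0.mean(), x0.std()
mu1, sd1 = x1.mean(), x1.std()

def log_gaussian(x, mu, sd):
    """Log of a univariate Gaussian density."""
    return -0.5 * np.log(2 * np.pi * sd**2) - (x - mu)**2 / (2 * sd**2)

def classify(x, prior0=0.5, prior1=0.5):
    """Bayes decision rule: choose the class with the larger posterior
    (equivalently, the larger log-likelihood plus log-prior)."""
    score0 = log_gaussian(x, mu0, sd0) + np.log(prior0)
    score1 = log_gaussian(x, mu1, sd1) + np.log(prior1)
    return np.where(score1 > score0, 1, 0)

# Error rate on the training data (for illustration only).
preds = np.concatenate([classify(x0), classify(x1)])
labels = np.concatenate([np.zeros(200), np.ones(200)])
print("training error:", np.mean(preds != labels))
```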
|
|
Exam No. 1 | 15% |
Exam No. 2 | 15% |
Exam No. 3 | 15% |
Final Exam | 15% |
Computer Homework | 40% |
TOTAL: | 100% |
|
Lecture Schedule (Topic | Readings):
Introduction: An Overview of Machine Learning | 1.1 - 1.3
Decision Theory: Bayes Rule | 9.1, 9.2
Decision Theory: Gaussian Classifiers | 9.3, 9.4
Decision Theory: Generalized Gaussian Classifiers | 9.5 - 9.A
Parameter Estimation: Maximum Likelihood | 3.1
Parameter Estimation: The Bayesian Approach | 9.2
Decision Theory: Discriminant Analysis | 10.4
Parameter Estimation: The Expectation Maximization Theorem | 10.1
Hidden Markov Models: Introduction | Notes
Hidden Markov Models: Evaluation | Notes
Hidden Markov Models: Decoding and Dynamic Programming | Notes
Hidden Markov Models: Parameter Reestimation and Continuous Distributions | Notes
Exam No. 1 (Lectures 01 - 11) | Notes
Parameter Estimation: Information Theory Review | Notes
Parameter Estimation: Discriminative Training | Notes
Experimental Design: Foundations of Machine Learning | 11.1 - 11.6
Experimental Design: Statistical Significance and Confidence | Notes
Parameter Estimation: Jackknifing, Bootstrapping and Combining Classifiers | 4.4 - 4.6, 7.1 - 7.5
Parameter Estimation: Nonparametric Techniques | 4.5, 4.6
Unsupervised Learning: Clustering | 10.2
Unsupervised Learning: Hierarchical Clustering | 10.3 - 10.5
Supervised Learning: Decision Trees | 2.3, 7.1 - 7.5
Supervised Learning: Support Vector Machines | 8.1 - 8.B
Neural Networks: Introduction | 6.1, 6.2, 6.A
Neural Networks: Vanishing Gradients and Regularization | 3.3, 5.3
Exam No. 2 (Lectures 10 - 24) | Notes
Neural Networks: Linear and Logistic Regression | 3.1 - 3.A
Neural Networks: Deep Learning | Notes
Neural Networks: Alternative Architectures | 11.1 - 11.3
Neural Networks: Deep Generative Models and Autoencoders | 10.3
Neural Networks: Alternative Supervision Strategies | 6.1, 6.2
Neural Networks: Transfer Learning | Notes
Neural Networks: Alternative Activation Functions and Optimizers | 5.4 - 5.6
Attention and Transformers | Notes
More About Transformers | 10.3 - 10.4
Transformer Architectures | Notes
Explainability in AI | Notes
Trustworthiness in AI | Notes
Exam No. 3 (Lectures 25 - 37) | Notes
Applications: Human Language Technology (Sequential Decoding) | Notes
Applications: Large Language Models and the ChatGPT Revolution | Notes
Applications: Sampling Techniques and Quantum Computing | Notes
Final Exam (08:00 AM - 10:00 AM) | N/A
|
Computer Homework |
Gaussian Distributions
Bayesian Decision Theory
ML and Bayesian Parameter Estimation
Gaussian Mixture Distribution Parameter Estimation
Dynamic Programming (see the sketch after this list)
Hidden Markov Models (HMMs)
Information Theory and Statistical Significance
LDA, K-Nearest Neighbors and K-Means Clustering
Bootstrapping, Bagging and Combining Classifiers
Nonparametric Classifiers, ROC Curves and AUC
Multilayer Perceptrons
Convolutional Neural Networks
Transformers
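To give a flavor of the Dynamic Programming and HMM assignments above, here is a minimal Viterbi decoding sketch in Python with NumPy. The two-state, three-symbol model is hypothetical and exists only to make the example runnable; the actual assignments define their own models and data.

```python
import numpy as np

# Hypothetical toy HMM: two hidden states, three observation symbols.
A = np.array([[0.7, 0.3],        # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],   # observation probabilities per state
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])        # initial state distribution
obs = [0, 1, 2, 2]               # an observed symbol sequence

n_states, T = A.shape[0], len(obs)
delta = np.zeros((T, n_states))           # best log-probability ending in each state
psi = np.zeros((T, n_states), dtype=int)  # backpointers

# Work in log space to avoid underflow on long sequences.
logA, logB, logpi = np.log(A), np.log(B), np.log(pi)

delta[0] = logpi + logB[:, obs[0]]
for t in range(1, T):
    for j in range(n_states):
        scores = delta[t - 1] + logA[:, j]
        psi[t, j] = np.argmax(scores)
        delta[t, j] = scores[psi[t, j]] + logB[j, obs[t]]

# Backtrack to recover the most likely state sequence.
path = [int(np.argmax(delta[-1]))]
for t in range(T - 1, 0, -1):
    path.append(int(psi[t, path[-1]]))
path.reverse()
print("most likely state sequence:", path)
```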