DEEP LEARNING
Academic Year 2024/2025 - Teacher: SIMONE PALAZZO
Expected Learning Outcomes
Knowledge and Understanding
This course provides knowledge of Machine Learning and Deep Learning techniques and algorithms, with a particular focus on regression models, classification, and unsupervised learning. Students will learn how to evaluate model performance through error metrics and validation techniques, addressing issues such as overfitting and the bias-variance tradeoff. The course also explores regularization methods, ensemble algorithms like bagging and boosting, and then focuses on neural networks and convolutional neural networks (CNN). Brief overviews of Transformers and approaches to explainability and interpretability offer insights into the most advanced technologies.
Applied Knowledge and Understanding
The course includes practical examples and exercises that will allow students to apply Deep Learning methods to real-world problems, using software tools commonly used in the industry such as scikit-learn and PyTorch. Students will learn to design and train Deep Learning models, manage data loading and preprocessing, and validate their performance using standard metrics.
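As a minimal sketch of this workflow (the dataset, model, and split below are illustrative assumptions, not the course's official exercises), a typical scikit-learn pipeline combines preprocessing, training, and validation:

```python
# Minimal sketch of the workflow described above: load data, preprocess,
# train a classifier, and validate with standard metrics.
# Dataset and model choices are illustrative, not course material.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Preprocessing (feature scaling) and model combined in one pipeline,
# so the scaler is fit only on the training split.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Standard metrics: accuracy, precision, recall, F1 per class.
print(classification_report(y_test, model.predict(X_test)))
```

Building the scaler into the pipeline ensures it is fit only on the training split, the kind of validation hygiene the course addresses under performance evaluation.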
Making Judgments
Students will develop the ability to evaluate the performance of Deep Learning models, identify and mitigate overfitting and bias issues, and choose between different models and optimization techniques based on the problem context.
Communication Skills
Students will acquire the ability to effectively communicate the results of their analyses and models, both in written and oral form. They will be able to present the outcomes of their research and projects to both technical and non-technical audiences, using appropriate language and supporting their arguments with relevant data and visualizations.
Learning Skills
The course will encourage students to develop a critical and autonomous approach to learning, fostering curiosity about new technologies and trends in the field of Machine Learning and Deep Learning. Students will be able to continue learning independently, using academic and professional resources to stay updated on the latest innovations and techniques in the sector.
Course Structure
Lectures, classroom exercises
Required Prerequisites
Preliminary knowledge of programming and of the fundamentals of linear algebra and mathematical analysis is required.
Attendance of Lessons
Strongly recommended.
Detailed Course Content
1. Basic Concepts
1.1. Hypotheses and Models
1.2. Bias-Variance Trade-Off
1.3. Learning Modes
1.4. Applications of Machine Learning
2. Linear Regression and Optimization
2.1. Basic Concepts of Linear Regression
2.2. Least Squares Method
2.3. Analytical Optimization and Gradient Descent
3. Performance Evaluation
3.1. Evaluation Metrics: Accuracy, Precision, Recall, F1-Score
3.2. ROC Curve and AUC
3.3. Cross-Validation
4. Regularization
4.1. Principles of Regularization
4.2. Ridge Regression
4.3. LASSO
5. Classifiers
5.1. Logistic Regression
5.2. Support Vector Machine (SVM)
5.3. Kernel Trick
5.4. Non-Parametric Models: k-NN
6. PCA
6.1. Dimensionality Reduction
6.2. PCA: Theory and Applications
7. Unsupervised Learning
7.1. Clustering: K-Means, Hierarchical
7.2. Gaussian Mixture Models
7.3. Evaluation of Clustering Algorithms
8. Decision Trees and Bagging/Boosting
8.1. Decision Trees
8.2. Random Forests
8.3. AdaBoost and Gradient Boosting
9. Neural Networks
9.1. Neural Network Architecture
9.2. Activation Functions
9.3. Backpropagation
9.4. Optimization Algorithms: Mini-Batch Gradient Descent, Gradient Descent with Momentum, Learning Rate Decay
10. Convolutional Neural Networks (CNNs)
10.1. Basic Concepts: Padding, Strided Convolution, Dilation, 2D and 3D Convolution, and Pooling
10.2. Basic CNN Architecture and State-of-the-Art Models
10.3. Training CNNs: Regularization, Dropout, Batch Normalization, Data Augmentation
10.4. Introduction to Explainability and Interpretability
11. Transformers
11.1. Transformer Architecture
11.2. Self-Attention Mechanism
11.3. Applications
12. Python for Machine Learning
12.1. Python Language (Syntax, Data Types, Functions, and Classes)
12.2. NumPy, SciPy, Pandas, Matplotlib
12.3. scikit-learn (Classification, Regression, Clustering)
12.4. PyTorch (Neural Networks, CNN; a minimal training sketch follows this outline)
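To make the thread running through topics 9, 10, and 12.4 concrete, here is a minimal, self-contained PyTorch sketch of a small CNN trained with mini-batch gradient descent with momentum. Random tensors stand in for a real dataset, and the architecture is an illustrative assumption, not the reference model used in the course laboratories.

```python
# Minimal PyTorch sketch of topics 9, 10, and 12.4: a small CNN trained with
# mini-batch gradient descent with momentum. Random tensors stand in for a
# real dataset; the architecture is illustrative, not the course's lab model.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Fake 10-class image dataset: 256 RGB images of size 32x32.
images = torch.randn(256, 3, 32, 32)
labels = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(images, labels), batch_size=32, shuffle=True)

# Basic CNN: convolution + pooling for feature extraction (topic 10.1),
# followed by a fully connected classifier head.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # padding keeps 32x32
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),
)

# Mini-batch gradient descent with momentum (topic 9.4).
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()   # backpropagation (topic 9.3)
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

Changing the optimizer's momentum setting or attaching a scheduler from torch.optim.lr_scheduler gives the learning rate decay variant listed under topic 9.4.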
Textbook Information
Study material provided by the teachers.
Course Planning
| | Subjects | Text References |
|---|---|---|
| 1 | Basic concepts | |
| 2 | Linear regression and optimization | |
| 3 | Laboratory on linear regression | |
| 4 | Performance evaluation | |
| 5 | Laboratory on performance evaluation | |
| 6 | Regularization | |
| 7 | Laboratory on regularization | |
| 8 | Classification | |
| 9 | Laboratory on classification | |
| 10 | PCA | |
| 11 | Laboratory on PCA | |
| 12 | Unsupervised learning | |
| 13 | Laboratory on unsupervised learning | |
| 14 | Decision trees and bagging/boosting | |
| 15 | Laboratory on decision trees | |
| 16 | Neural networks | |
| 17 | Laboratory on neural networks | |
| 18 | Convolutional neural networks (CNN) | |
| 19 | Laboratory on CNN | |
| 20 | Transformers | |
| 21 | Laboratory on transformers | |
Learning Assessment
Learning Assessment Procedures
Project on topics proposed by the teachers, to be completed at home. Oral exam with discussion of the project and theoretical questions.
Examples of frequently asked questions and/or exercises
- Explain the bias-variance trade-off and how it influences the generalization ability of a machine learning model.
- Describe the least squares method and how it is used to estimate the parameters of a linear regression model.
- What is the difference between precision and recall? In which scenarios might one be more important than the other? (See the worked example after this list.)
- Compare Ridge regression and LASSO, specifying when it would be preferable to use one over the other.
- How does the kernel trick extend the capabilities of SVMs to solve non-linearly separable problems?
- How does the PCA algorithm work for dimensionality reduction and what are its limitations?
- Explain how a boosting algorithm like AdaBoost works and what are its advantages over individual decision trees.
- Describe the concept of 2D convolution in CNNs and how it affects feature extraction from images.
- Explain the self-attention mechanism in Transformers and how it contributes to modeling long-range dependencies in sequential data.
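As an illustration of the kind of answer expected, the following sketch works through the precision/recall question above; the confusion counts are made up for the example:

```python
# Worked example for the precision/recall question above, with made-up counts.
# Out of 100 test samples: 8 true positives, 2 false positives,
# 4 false negatives, 86 true negatives.
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)  # 8/10 = 0.80: of predicted positives, how many are real
recall = tp / (tp + fn)     # 8/12 ~ 0.67: of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean ~ 0.73

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# High recall matters when missing a positive is costly (e.g. disease screening);
# high precision matters when false alarms are costly (e.g. spam filtering).
```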