Lecturer: Nazem Khan
General prerequisites:
The course assumes a good undergraduate-level understanding of statistics, especially the basic characteristics of key univariate distributions, statistical estimators (least squares, method of moments, maximum likelihood), confidence intervals, hypothesis testing, normality tests, and F-tests. Students should also familiarize themselves with Python as a programming language, with the aim of understanding the plotting techniques used in exploratory data analysis.
Course term: Michaelmas
Course lecture information: 6 hours of lectures week -1.
Course overview:
This introductory course provides a practical foundation in statistical learning, with a strong emphasis on data analysis and modelling using real-world financial data. Students will begin with exploratory data analysis and visualization, gaining familiarity with financial datasets and learning essential data transformation techniques.
The course then introduces the concept of estimators, using the mean and variance as key examples, and explores Maximum Likelihood Estimation (MLE) as a fundamental approach. Building on this, we formalize the statistical learning framework with the model Y = f(X) + ε, and discuss strategies for estimating the function f, contrasting parametric and non-parametric methods.
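As an illustration of the estimators mentioned above, the following is a minimal sketch of maximum likelihood estimation of the mean and variance, assuming i.i.d. Gaussian data. The simulated "returns" series and the function names (`gaussian_mle`, `gaussian_log_likelihood`) are assumptions made for this example and are not taken from the course material.

```python
import math
import random
import statistics


def gaussian_mle(sample):
    """Return the MLE (mu_hat, sigma2_hat) under an i.i.d. Gaussian model."""
    n = len(sample)
    mu_hat = sum(sample) / n
    # The MLE of the variance divides by n, not n - 1, so it is biased downwards.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in sample) / n
    return mu_hat, sigma2_hat


def gaussian_log_likelihood(sample, mu, sigma2):
    """Log-likelihood of the sample under N(mu, sigma2)."""
    n = len(sample)
    squared_deviations = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_deviations / (2 * sigma2)


# Simulated daily returns (hypothetical parameters, not real market data).
rng = random.Random(0)
returns = [rng.gauss(0.0005, 0.01) for _ in range(1000)]

mu_hat, sigma2_hat = gaussian_mle(returns)
unbiased_variance = statistics.variance(returns)  # divides by n - 1

print("MLE mean:         ", mu_hat)
print("MLE variance:     ", sigma2_hat)
print("Unbiased variance:", unbiased_variance)
print("Log-likelihood at the MLE:", gaussian_log_likelihood(returns, mu_hat, sigma2_hat))
```

The two variance estimates differ only by the factor (n - 1)/n, which is a simple way to see that a maximum likelihood estimator need not be unbiased.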
We then turn to evaluating model performance, introducing the bias-variance tradeoff, overfitting, and other key principles that guide effective model selection. The core parametric method studied is Linear Regression, with implementation and analysis conducted in Python.
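The link between model flexibility and overfitting can be sketched with polynomial regression on simulated data. This is only an illustrative example: the true function `true_f`, the noise level, the sample sizes, and the polynomial degrees are all assumptions chosen for the demonstration, and NumPy is assumed to be installed.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_f(x):
    """Hypothetical 'true' regression function f in Y = f(X) + eps."""
    return np.sin(2.0 * x)


def simulate(n):
    """Draw n observations from Y = f(X) + eps with Gaussian noise."""
    x = rng.uniform(-1.5, 1.5, size=n)
    y = true_f(x) + rng.normal(0.0, 0.3, size=n)
    return x, y


def mse(y, y_hat):
    """Mean squared error between observations and predictions."""
    return float(np.mean((y - y_hat) ** 2))


x_train, y_train = simulate(30)    # small training set
x_test, y_test = simulate(1000)    # large held-out set

results = {}
for degree in (1, 3, 10):
    # Least-squares fit of a polynomial of the given degree.
    coefs = np.polyfit(x_train, y_train, deg=degree)
    train_error = mse(y_train, np.polyval(coefs, x_train))
    test_error = mse(y_test, np.polyval(coefs, x_test))
    results[degree] = (train_error, test_error)
    print(f"degree={degree:2d}  train MSE={train_error:.4f}  test MSE={test_error:.4f}")
```

Training error cannot increase as the degree grows, because each model nests the previous one. Test error typically falls at first (less bias) and then rises again (more variance), which is the pattern the bias-variance tradeoff describes.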
Course synopsis:
Data analysis
Data visualization
Data transformation
Estimators
Least squares estimation
Maximum likelihood estimation
Ordinary Least Squares (OLS) estimation
Interpretation of regression outputs
Assumptions underlying the linear model
Statistical inference: significance testing of coefficients and construction of confidence intervals
Model diagnostics to assess goodness-of-fit and detect violations of assumptions
Model selection and variable transformation, including polynomial regression
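The regression topics in the synopsis (OLS estimation, significance testing of coefficients, confidence intervals, and goodness-of-fit) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the data are simulated, the variable names (`market_excess`, `asset_excess`) and the helper `ols_summary` are hypothetical, and the confidence intervals use a large-sample normal approximation rather than the exact Student t quantile.

```python
from statistics import NormalDist

import numpy as np


def ols_summary(X, y, level=0.95):
    """Fit y = X @ beta + eps by OLS and return basic inference quantities.

    X must already contain a column of ones if an intercept is wanted.
    """
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y                      # OLS estimate
    resid = y - X @ beta
    sigma2 = (resid @ resid) / (n - p)            # unbiased noise variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))       # standard errors of coefficients
    t_stats = beta / se                           # test statistics for H0: beta_j = 0
    # Assumption: normal quantile used as a large-sample stand-in for the t quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci = np.column_stack([beta - z * se, beta + z * se])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return {"beta": beta, "se": se, "t": t_stats, "ci": ci, "r2": float(r2)}


# Simulated excess returns (hypothetical values, not real market data).
rng = np.random.default_rng(1)
n = 500
market_excess = rng.normal(0.0, 1.0, size=n)
asset_excess = 1.0 + 2.0 * market_excess + rng.normal(0.0, 1.0, size=n)

X = np.column_stack([np.ones(n), market_excess])  # intercept and slope
summary = ols_summary(X, asset_excess)

for name, b, s, t, (lo, hi) in zip(
    ("intercept", "slope"), summary["beta"], summary["se"], summary["t"], summary["ci"]
):
    print(f"{name:9s} coef={b:.3f}  se={s:.3f}  t={t:.1f}  95% CI=({lo:.3f}, {hi:.3f})")
print(f"R^2 = {summary['r2']:.3f}")
```

A coefficient whose interval excludes zero is significant at the corresponding level. For small samples, the normal quantile should be replaced by the Student t quantile with n - p degrees of freedom.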