- DATA 101 (3)
**Making Predictions with Data** - Introduction to the techniques and software for handling real-world data. Topics include data cleaning, visualization, simulation, basic modelling, and prediction making. [3-1-0]

- DATA 301 (3)
**Introduction to Data Analytics** - Techniques for computation, analysis, and visualization of data using software. Manipulation of small and large data sets. Automation using scripting. Real-world applications from life sciences, physical sciences, economics, engineering, or psychology. No prior computing background is required. Credit will be granted for only one of COSC 301, DATA 301 or DATA 501. [3-2-0]

*Prerequisite:* Either (a) third-year standing, or (b) one of COSC 111 or COSC 122.

*Equivalency:* COSC 301.

- DATA 310 (3)
**Applied Regression Analysis** - Theory and application of simple and multiple linear regression models, estimation, inference (confidence intervals, prediction intervals and hypothesis testing), polynomial regression, ANOVA and ANCOVA, variable selection, model adequacy and residual diagnostics. [3-1-0]

*Prerequisite:* STAT 230 and MATH 221.

- DATA 311 (3)
**Machine Learning** - Regression, classification, resampling, model selection and validation, fundamental properties of matrices, dimension reduction, tree-based methods, unsupervised learning. [3-2-0]

*Prerequisite:* Either (a) STAT 230 or (b) a score of more than 75% in one of APSC 254, BIOL 202, PSYO 373; and one of COSC 111, APSC 177.

- DATA 315 (3)
**Applied Time Series and Forecasting** - Trends, stationary and nonstationary time series models, forecasting, seasonal models. [3-1-0]

*Prerequisite:* STAT 230.

- DATA 405 (3)
**Stochastic Modelling and Simulation** - Pseudorandom number generation and testing. Simulation and modelling of univariate and multivariate data; stochastic models, including Poisson processes and Markov chains; MCMC simulation, hidden Markov models, and queuing systems. Credit will be granted for only one of COSC 405, DATA 405, COSC 505, or DATA 505. [3-2-0]

*Prerequisite:* A score of 60% or higher in STAT 230.

- DATA 407 (3)
**Sampling and Design** - Planning and practice of data collection. Advantages and disadvantages of observational and experimental data. Survey samples: random sampling; bias and variance; unequal probability sampling; systematic, multistage, and stratified sampling; ratio and regression estimators. Experimental design: simple one-way comparisons; designs with randomization restrictions including blocking, split-plots, nested and repeated measures designs. Credit will be granted for only one of DATA 407 or STAT 507. [3-1-0]

*Prerequisite:* One of STAT 230, PSYO 372, BIOL 202.

- DATA 410 (3)
**Regression and Generalized Linear Models** - Regression, linear models, generalized linear models, additive models, generalized additive models, mixed models, theory and numerical performance. [3-2-0]

*Prerequisite:* All of MATH 221, STAT 303, DATA 311.

- DATA 419 (3-9) d
**Topics in Data Science** - Advanced or specialized topics in data science. Consult the department for the specific topic to be offered in any given year. This course may be taken more than once for credit with different topics. [3-2-0]

*Prerequisite:* Fourth-year standing.

- DATA 448 (3/6) d
**Directed Studies in Data Science** - Investigation of a specific topic as agreed upon by the student and the faculty supervisor. Completion of a project and an oral presentation are required.

*Prerequisite:* Third-year standing in the Data Science major or Honours, and permission of the department head.

- DATA 449 (6)
**Honours Thesis** - Students will undertake a research project as agreed upon by the student, supervising faculty member, and unit head. A written thesis and a public presentation (poster or seminar) are required. Restricted to students in the B.Sc. Data Science Honours Program.

*Prerequisite:* Fourth-year standing and permission of the department head.

- DATA 500 (3)
**Communication and Consulting in Data Science** - Effective consulting practices, ethical considerations, methodology selection, data preparation, effective software development. Credit will be granted for only one of DATA 500 or STAT 400 when the subject matter is of the same nature.

- DATA 501 (3)
**Data Analytics** - Techniques for computation, analysis, and visualization of data using software. Manipulation of small and large data sets. Automation using scripting. Real-world applications from life sciences, physical sciences, engineering, or psychology. Credit will be granted for only one of COSC 301, DATA 301 or DATA 501.

- DATA 505 (3)
**Modelling and Simulation** - Simulation methodology: data collection, model design, output analysis, optimization, validation. Credit will be granted for only one of COSC 405, DATA 405, COSC 505, or DATA 505.

- DATA 530 (1)
**Computing Platforms for Data Science** - Introduction to software and tools for Data Science. Setup process. Restricted to students in the MDS program.

- DATA 531 (1)
**Programming for Data Science** - Programming including decisions, loops, functions, and using data structures and libraries. Restricted to students in the MDS program.

- DATA 532 (1)
**Algorithms and Data Structures** - Data structures including lists, queues, stacks, hash tables, trees and graphs. Recursion. Searching and sorting. Asymptotic complexity. Restricted to students in the MDS program.

- DATA 533 (1)
**Collaborative Software Development** - Software life cycle. Licensing. Packaging. Testing and quality control. Version control. Collaborative environments. Restricted to students in the MDS program.

*Prerequisite:* DATA 532.

- DATA 534 (1)
**Web and Cloud Computing** - Parallel and cloud computing architectures and program deployment. Restricted to students in the MDS program.

- DATA 540 (1)
**Databases and Data Retrieval** - Using and querying relational and NoSQL databases for analysis. Experience with SQL, JSON, and programs that use databases. Restricted to students in the MDS program.

*Prerequisite:* DATA 531.

- DATA 541 (1)
**Scripting and Reporting** - Scripting engines for data science. Reporting tools. Automation. Restricted to students in the MDS program.

- DATA 542 (1)
**Data Wrangling** - Manipulation of data using software tools. Data conversion, filtering, sorting, grouping, cleaning, parsing. Automation. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 532, DATA 540, DATA 541.

- DATA 543 (1)
**Data Collection** - Fundamental techniques in the collection of data. Focus on understanding the effects of randomization, restrictions on randomization, repeated measures, and blocking on model fitting. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 540, DATA 570.

- DATA 550 (1)
**Dataviz I** - Data visualization to produce graphs and images. Advanced data analysis on spreadsheets. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 530, DATA 531.

- DATA 551 (1)
**Dataviz II** - Data visualization using business intelligence and data analysis software. Interactive visualization. Production of visualizations for mobile and web. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 534, DATA 543, DATA 550.

- DATA 552 (1)
**Communication and Argumentation** - Interpretation of data. Argumentation: hypothesis, claim, evidence and inference. Model limitations: bias, validity, reliability, sensitivity analysis. Communication of recommendations to decision-makers. Restricted to students in the MDS program.

- DATA 553 (1)
**Privacy, Security and Professional Ethics** - Data privacy laws and expectations. Freedom of information. Ethics board. Licensing. Data security. Restricted to students in the MDS program.

- DATA 570 (1)
**Predictive Modelling** - Introduction to regression for Data Science. Simple linear regression, multiple linear regression, interactions, mixed variable types, model assessment, simple variable selection, k-nearest-neighbours regression. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 571 (1)
**Resampling and Regularization** - Resampling techniques and regularization for linear models. Bootstrap, jackknife, cross-validation, ridge regression, lasso, discussion of tuning parameters. Restricted to students in the MDS program.

*Prerequisite:* DATA 570.

- DATA 572 (1)
**Supervised Learning** - Analysis of data with categorical responses. Logistic regression, k-nearest-neighbours classification, discriminant analysis, decision trees and random forests. Restricted to students in the MDS program.

*Prerequisite:* DATA 571.

- DATA 573 (1)
**Unsupervised and Semi-supervised Learning** - Analyses for data with unknown responses. Distance measures, hierarchical clustering, k-means, mixture models. Restricted to students in the MDS program.

*Prerequisite:* DATA 572.

- DATA 580 (1)
**Modelling and Simulation I** - Pseudorandom number generation, testing and transformation to other discrete and continuous data types. Introduction to Poisson processes and the simulation of data from predictive models, as well as temporal and spatial models. Restricted to students in the MDS program.

- DATA 581 (1)
**Modelling and Simulation II** - Markov chains and their applications, for example, queueing and Markov Chain Monte Carlo. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 582 (1)
**Bayesian Inference** - Introduction to the Bayesian paradigm and tools for Data Science. Topics include Bayes' theorem, prior, likelihood, and posterior. Detailed analysis of binomial samples, normal samples, and normal linear regression models. A significant focus will be on computational aspects of Bayesian problems using software packages. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 572, DATA 581.

- DATA 583 (1)
**Advanced Predictive Modelling** - Splines. Smoothing. Generalized linear models. Generalized additive models. An introduction to mixed models. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 572, DATA 581.

- DATA 585 (1)
**Optimization** - Modelling using mathematical programming. Fundamental continuous and discrete optimization algorithms. Optimization software for small to medium scale problems. Optimization algorithms for data science. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 586 (1)
**Advanced Machine Learning** - Neural networks, backpropagation, deep learning. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 589 (1)
**Special Topic** - Advanced or specialized topic in Data Science with applications to specific data sets. Restricted to students in the MDS program.

*Prerequisite:* DATA 543.

- DATA 599 (6)
**Capstone** - A capstone project designed to give students experience in performing data science on a complex multi-disciplinary project. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 583, DATA 586.