- DATA 301 (3)
**Introduction to Data Analytics** - Techniques for computation, analysis, and visualization of data using software. Manipulation of small and large data sets. Automation using scripting. Real-world applications from life sciences, physical sciences, economics, engineering, or psychology. No prior computing background is required. Credit will be granted for only one of COSC 301, DATA 301 or DATA 501. [3-2-0]

*Prerequisite:* Either (a) third-year standing, or (b) one of COSC 111 or COSC 122.

*Equivalency:* COSC 301.

- DATA 311 (3)
**Machine Learning** - Regression, classification, resampling, model selection and validation, fundamental properties of matrices, dimension reduction, tree-based methods, unsupervised learning. [3-2-0]

*Prerequisite:* Both of (one of STAT 230 or 75% in either APSC 254, BIOL 202, or PSYO 373) and (one of COSC 111 or APSC 177).

- DATA 405 (3)
**Modelling and Simulation** - Numeric dynamic systems models, with emphasis on discrete stochastic systems. State description of models, common model components, entities. Common simulation language. Simulation using algebraic languages. Simulation methodology: data collection, model design, output analysis, optimization, validation. Elements of queuing theory, relationship to simulation. Applications to computer systems models. Credit will be granted for only one of COSC 405, DATA 405, COSC 505, or DATA 505. [3-2-0]

*Prerequisite:* A score of 60% or higher in COSC 221 and a score of 60% or higher in COSC 222.

*Equivalency:* COSC 405.

- DATA 407 (3)
**Sampling and Design** - Planning and practice of data collection. Pros and cons of both observational and experimental data. Survey samples: random sampling; bias and variance; unequal probability sampling; systematic, multistage, and stratified sampling; ratio and regression estimators. Experimental design: simple one-way comparisons; designs with randomization restrictions including blocking, split-plots, nested and repeated measures designs. Credit will be granted for only one of DATA 407 or STAT 507. [3-1-0]

*Prerequisite:* One of STAT 230, PSYO 372, BIOL 202, ECON 327.

- DATA 410 (3)
**Regression and Generalized Linear Models** - Regression, linear models, generalized linear models, additive models, generalized additive models, mixed models. Theory and numerical performance. Credit will be granted for only one of DATA 410 or STAT 538. [3-2-0]

*Prerequisite:* DATA 311.

- DATA 419 (3-9) d
**Topics in Data Science** - Advanced or specialized topics in data science. Consult the department for the specific topic to be offered in any given year. This course may be taken more than once for credit with different topics. [3-2-0]

*Prerequisite:* Fourth-year standing.

- DATA 421 (3)
**Network Science** - Graphs and complex networks in scientific research. Probabilistic and statistical models. Structures, patterns, and behaviors in networks. Algorithmic and statistical methods. Online/mobile social networks and social media platforms. Social influence, information diffusion, and viral marketing. Sentiment analysis and opinion mining. Data privacy. Search engines and recommendation systems. Credit will be granted for only one of COSC 421, DATA 421 or DATA 521. [3-2-0]

*Prerequisite:* Third-year standing.

*Equivalency:* COSC 421.

- DATA 448 (3/6) d
**Directed Studies in Data Science** - Investigation of a specific topic as agreed upon by the student and the faculty supervisor. Completion of a project and an oral presentation are required.

*Prerequisite:* Third-year standing in the Data Science major or Honours, and permission of the department head.

- DATA 449 (6)
**Honours Thesis** - Students will undertake a research project as agreed upon by the student, supervising faculty member, and unit head. A written thesis and a public presentation (poster or seminar) are required. Restricted to students in the B.Sc. Data Science Honours Program.

*Prerequisite:* Fourth-year standing and permission of the department head.

- DATA 500 (3)
**Communication and Consulting in Data Science** - Effective consulting practices, ethical considerations, methodology selection, data preparation, effective software development. Credit will be granted for only one of DATA 500 or STAT 400 when the subject matter is of the same nature.

- DATA 501 (3)
**Data Analytics** - Techniques for computation, analysis, and visualization of data using software. Manipulation of small and large data sets. Automation using scripting. Real-world applications from life sciences, physical sciences, engineering, or psychology. Credit will be granted for only one of COSC 301, DATA 301 or DATA 501.

- DATA 505 (3)
**Modelling and Simulation** - Simulation methodology: data collection, model design, output analysis, optimization, validation. Credit will be granted for only one of COSC 405, DATA 405, COSC 505, or DATA 505.

- DATA 521 (3)
**Network Science** - Graphs and complex networks in scientific research. Probabilistic and statistical models. Structures, patterns, and behaviors in networks. Algorithmic and statistical methods. Online/mobile social networks and social media platforms. Social influence, information diffusion, and viral marketing. Sentiment analysis and opinion mining. Data privacy. Search engines and recommendation systems. Credit will be granted for only one of COSC 421, DATA 421 or DATA 521.

- DATA 530 (1)
**Computing Platforms for Data Science** - Introduction to software and tools for Data Science. Setup process. Credit will be granted for only one of DATA 301 or DATA 530. Restricted to students in the MDS program.

- DATA 531 (1)
**Programming for Data Science** - Programming including decisions, loops, functions, and using data structures and libraries. Credit will be granted for only one of DATA 301 or DATA 531. Restricted to students in the MDS program.

- DATA 532 (1)
**Algorithms and Data Structures** - Data structures including lists, queues, stacks, hash tables, trees and graphs. Recursion. Searching and sorting. Asymptotic complexity. Restricted to students in the MDS program.

- DATA 533 (1)
**Collaborative Software Development** - Software life cycle. Licensing. Packaging. Testing and quality control. Version control. Collaborative environments.

*Prerequisite:* DATA 532.

- DATA 534 (1)
**Web and Cloud Computing** - Parallel and cloud computing architectures and program deployment. Restricted to students in the MDS program.

- DATA 540 (1)
**Databases and Data Retrieval** - Using and querying relational and NoSQL databases for analysis. Experience with SQL, JSON, and programs that use databases. Restricted to students in the MDS program.

*Prerequisite:* DATA 531.

- DATA 541 (1)
**Scripting and Reporting** - Scripting engines for data science. Reporting tools. Automation. Restricted to students in the MDS program.

- DATA 542 (1)
**Data Wrangling** - Manipulation of data using software tools. Data conversion, filtering, sorting, grouping, cleaning, parsing. Automation. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 532, DATA 540, DATA 541.

- DATA 543 (1)
**Data Collection** - Fundamental techniques in the collection of data. Focus on understanding the effects of randomization, restrictions on randomization, repeated measures, and blocking on model fitting. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 540, DATA 570.

- DATA 550 (1)
**Data Visualization I** - Data visualization to produce graphs and images. Advanced data analysis on spreadsheets. Credit will be granted for only one of DATA 301 or DATA 550. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 530, DATA 531.

- DATA 551 (1)
**Dataviz II** - Data visualization using business intelligence and data analysis software. Interactive visualization. Production of visualizations for mobile and web. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 534, DATA 543, DATA 550.

- DATA 552 (1)
**Communication and Argumentation** - Interpretation of data. Argumentation: hypothesis, claim, evidence and inference. Model limitations: bias, validity, reliability, sensitivity analysis. Communication of recommendations to decision-makers. Restricted to students in the MDS program.

- DATA 553 (1)
**Privacy, Security and Professional Ethics** - Data privacy laws and expectations. Freedom of information. Ethics board. Licensing. Data security. Restricted to students in the MDS program.

- DATA 570 (1)
**Predictive Modelling** - Introduction to regression for Data Science. Simple linear regression, multiple linear regression, interactions, mixed variable types, model assessment, simple variable selection, k-nearest-neighbours regression. Credit will be granted for only one of DATA 311 or DATA 570. Restricted to students in the MDS program.

- DATA 571 (1)
**Resampling and Regularization** - Resampling techniques and regularization for linear models. Bootstrap, jackknife, cross-validation, ridge regression, lasso, discussion of tuning parameters. Credit will be granted for only one of DATA 311 or DATA 571. Restricted to students in the MDS program.

*Prerequisite:* DATA 570.

- DATA 572 (1)
**Supervised Learning** - Analysis of data with categorical responses. Logistic regression, k-nearest-neighbours classification, discriminant analysis, decision trees and random forests. Credit will be granted for only one of DATA 311 or DATA 572. Restricted to students in the MDS program.

*Prerequisite:* DATA 571.

- DATA 573 (1)
**Unsupervised and Semi-supervised Learning** - Analyses for data with unknown responses. Distance measures, hierarchical clustering, k-means, mixture models. Restricted to students in the MDS program.

*Prerequisite:* DATA 572.

- DATA 580 (1)
**Modelling and Simulation I** - Pseudorandom number generation, testing and transformation to other discrete and continuous data types. Introduction to Poisson processes and the simulation of data from predictive models, as well as temporal and spatial models. Credit will be granted for only one of DATA 405 or DATA 580. Restricted to students in the MDS program.

*Prerequisite:* Either (a) all of STAT 230, DATA 301 or (b) all of DATA 570, DATA 530.

- DATA 581 (1)
**Modelling and Simulation II** - Markov chains and their applications, for example, queueing and Markov Chain Monte Carlo. Credit will be granted for only one of DATA 405 or DATA 581. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 582 (1)
**Bayesian Inference** - Introduction to the Bayesian paradigm and tools for Data Science. Topics include Bayes' theorem, prior, likelihood and posterior. Detailed analysis of binomial samples, normal samples, and normal linear regression models. A significant focus will be on computational aspects of Bayesian problems using software packages. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 572, DATA 581.

- DATA 583 (1)
**Advanced Predictive Modelling** - Multiple linear regressions. Splines. Smoothing. Generalized additive models. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 572, DATA 581.

- DATA 585 (1)
**Optimization** - Modeling using mathematical programming. Fundamental continuous and discrete optimization algorithms. Optimization software for small to medium scale problems. Optimization algorithms for data science. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 586 (1)
**Advanced Machine Learning** - Neural networks, backpropagation, deep learning. Restricted to students in the MDS program.

*Prerequisite:* DATA 580.

- DATA 589 (1)
**Special Topic** - Advanced or specialized topic in Data Science with applications to specific data sets. Restricted to students in the MDS program.

*Prerequisite:* DATA 543.

- DATA 599 (1)
**Capstone** - A capstone project designed to give students experience in performing data science on a complex multidisciplinary project. Restricted to students in the MDS program.

*Prerequisite:* All of DATA 583, DATA 586.