**Data Science**

This interdisciplinary program focuses on analyzing and handling data from multiple sources and for various applications in order to draw inferences from it, combining topics from mathematics, statistics, and computer science. These topics include probability theory, inference, least-squares estimation, maximum likelihood estimation, finding local and global optimal solutions (gradient descent, genetic algorithms, etc.), and generalized additive models. The program also covers machine learning topics such as classification, conditional probability estimation, clustering, dimensionality reduction (e.g. discriminant factor and principal component analyses), and decision support systems. Finally, it covers big data analysis, including big data collection, preparation, preprocessing, warehousing, interactive visualization, analysis, scrubbing, mining, management, and modeling, along with tools such as Hadoop, MapReduce, and Apache Spark.

| Hosting Dept. | Open To | Courses |
|---|---|---|
| MATH | MATH, ICS, ISE | MATH 405: Learning from Data; STAT 413: Statistical Modeling; ISE 487: Predictive Analytics Techniques; ICS 475: Applied Big Data |

*MATH 405: Learning from Data*

This course introduces selected concepts from linear algebra, statistics, and optimization, with an emphasis on their applications in machine learning algorithms such as linear regression and neural networks, using numerical software, toolboxes, and libraries. Topics include basic vector and matrix operations, factorizations, basic probability theory, inference, least-squares estimation, maximum likelihood estimation, and gradient descent.

**Prerequisite:** (MATH 102 or MATH 106), (STAT 201, STAT 212, STAT 319, or ISE 205), and ICS 103

*STAT 413: Statistical Modeling*

Statistical tools for learning from data, with an emphasis on implementation using software, toolboxes, and libraries such as R, scikit-learn, and statsmodels. Topics include simple and multiple linear regression, polynomial regression, and splines; generalized additive models; hierarchical and mixed-effects models; Bayesian modeling; logistic regression, generalized linear models, and discriminant analysis; and model selection.

**Prerequisite:** MATH 405

*ISE 487: Predictive Analytics Techniques*

Characteristics of time series: trends, seasonality, noise, and stationarity; statistical background and model evaluation methods; time series regression, variable selection, and general linear regression; exponential smoothing and seasonal data; ARIMA-based models, including MA, AR, ARMA, ARIMA, and SARIMA; model validation and parameter estimation; advanced predictive analytics: multivariate prediction, state space models, neural networks, spectral analysis, and Bayesian methods.

**Prerequisite:** MATH 405

*ICS 475: Applied Big Data*

Introduction to big data; data collection on the cloud; data in IoT; challenges of big data analysis; virtualization in cloud computing systems; hypervisors for creating native virtual machines; Amazon AWS, AWS Lambda, Google App Engine, and Microsoft Azure; intelligent machines and deep learning networks; introduction to basic machine learning tools for cloud computing; machine learning algorithms on Hadoop and Spark.

**Prerequisite:** MATH 101, STAT 319 or equivalent