Data Science Foundations

Courses

Core Courses (students must select at least two):

Units: 3

Algorithm design techniques: use of data structures, divide and conquer, dynamic programming, greedy techniques, local and global search. Complexity and analysis of algorithms: asymptotic analysis, worst case and average case, recurrences, lower bounds, NP-completeness. Algorithms for classical problems including sorting, searching and graph problems [connectivity, shortest paths, minimum spanning trees].

Offered in Fall, Spring, and Summer

Units: 3

Complex and specialized data structures relevant to design and development of effective and efficient software. Hardware characteristics of storage media. Primary file organizations. Hashing functions and collision resolution techniques. Low-level and bit-level structures including signatures, superimposed coding, disjoint coding and Bloom filters. Tree and related structures including AVL trees, B*trees, tries and dynamic hashing techniques.

Offered in Spring Only

Units: 3

This course will introduce common statistical learning methods for supervised and unsupervised predictive learning in both the regression and classification settings. Topics covered will include linear and polynomial regression, logistic regression and discriminant analysis, cross-validation and the bootstrap, model selection and regularization methods, splines and generalized additive models, principal components, hierarchical clustering, nearest neighbor, kernel, and tree-based methods, ensemble methods, boosting, and support-vector machines.

Offered in Summer

Elective Courses (students must select at least one):

Units: 3

This course provides an introduction to concepts and methods for extracting knowledge or other useful forms of information from data. This activity, also known under names including data mining, knowledge discovery, and exploratory data analysis, plays an important role in modern science, engineering, medicine, business, and government. Students will apply supervised and unsupervised automated learning methods to extract patterns, make predictions and identify groups from data. Students will also learn about the overall process of data collection and analysis that provides the setting for knowledge discovery, and concomitant issues of privacy and security. Examples and projects introduce the students to application areas including electronic commerce, information security, biology, and medicine. Students cannot get credit for both CSC 422 and CSC 522.

Offered in Fall and Spring

Units: 3

Advanced database concepts. Logical organization of databases: the entity-relationship model; the relational data model and its languages. Functional dependencies and normal forms. Design, implementation, and optimization of query languages; security and integrity, concurrency control, transaction processing, and distributed database systems.

Offered in Fall and Spring

Units: 1-6

Topics of current interest in computer science not covered in existing courses.

Offered in Fall and Spring

Note: These special topics sections of CSC 591 may be used as electives:

  • Data Driven Business Intelligence - 3 credits
  • Graph Data Mining - 3 credits
  • Spatial and Temporal Data Mining - 3 credits

Units: 3

Introduction to Bayesian concepts of statistical inference; Bayesian learning; Markov chain Monte Carlo methods using existing software [SAS and OpenBUGS]; linear and hierarchical models; model selection and diagnostics.

Offered in Spring Only