Data Mining M.S.

Program Rationale:

  • The Master of Science in Data Mining prepares students to find interesting and useful patterns and trends in large data sets.
  • Students are provided with expertise in state-of-the-art data modeling methodologies to prepare them for information-age careers.

Learning Outcomes for Program Graduates:

Students in the program will be expected to:

  • approach data mining as a process, by demonstrating competency in the use of CRISP-DM (the Cross-Industry Standard Process for Data Mining), including the business understanding phase, the data understanding phase, the exploratory data analysis phase, the modeling phase, the evaluation phase, and the deployment phase;
  • be proficient with leading data mining software, including WEKA, Clementine by SPSS, and the R language;
  • understand and apply a wide range of clustering, estimation, prediction, and classification algorithms, including k-means clustering, BIRCH clustering, Kohonen clustering, classification and regression trees, the C4.5 algorithm, logistic regression, k-nearest neighbor, multiple regression, and neural networks; and
  • understand and apply the most current data mining techniques and applications, such as text mining, mining genomics data, and other current issues.

Program Prerequisites

Applicants to the Master of Science in Data Mining program are expected to have completed two semesters of applied statistics (such as STAT 104/STAT 453, STAT 200/STAT 201, or STAT 215/STAT 216) with grades of B or better, or two semesters of statistics approved by the advisor with grades of B or better, or to have the permission of the Data Mining Program Director. The second-semester course may be taken concurrently with STAT 521 Introduction to Data Mining.

Admission Requirements

Students must hold a Bachelor's degree from a regionally accredited institution of higher education. The undergraduate record must demonstrate clear evidence of ability to undertake and pursue studies successfully in a graduate field.

A minimum undergraduate GPA of 3.00 on a 4.00 scale (where A is 4.00), or its equivalent, and good standing (3.00 GPA) in all post-baccalaureate course work are required. Conditional admission may be granted to candidates with undergraduate GPAs as low as 2.40, provided the student receives no grade lower than a B in the first three core courses in the program.

In addition to the materials required by the School of Graduate Studies, the following are required by the program:

  • A formal application essay of 500-1000 words that focuses on (a) academic and work history, (b) reasons for pursuing the Master of Science in Data Mining, (c) future professional aspirations, and (d) where and how the applicant has completed the program prerequisites. The essay will also be used to demonstrate a command of the English language.
  • Two letters of recommendation, one each from the academic and work environments (or two from academia if the candidate has not been employed).

The application to the Data Mining program is completed online. All transcripts should be sent to the Graduate Admissions Office. The formal application essay, the prerequisite letter, and the two letters of recommendation can be emailed to the Director of the Data Mining program or mailed to:

Director of the Data Mining Program
Re: MS in Data Mining Admissions Materials
Department of Mathematical Sciences
Marcus White 128
Central Connecticut State University
New Britain, CT 06050

Course and Capstone Requirements

Core Courses

The following courses are required of all students.

STAT 520 Multivariate Analysis for Data Mining
STAT 521 Introduction to Data Mining
STAT 522 Clustering and Affinity Analysis
STAT 523 Predictive Analytics
STAT 526 Data Mining for Genomics and Proteomics
STAT 527 Text Mining
STAT 599 Thesis

Total Credit Hours: 27

Elective Courses

Choose any two courses from the following list:

CS 570 Topics in Artificial Intelligence
CS 580 Topics in Database Systems and Applications
STAT 455 Experimental Design
STAT 456/MKT 444 Fundamentals of SAS
STAT 465 Nonparametric Statistics
STAT 525 Web Mining
STAT 529 Current Issues in Data Mining
STAT 551 Applied Stochastic Processes
Other appropriate graduate course, with permission of advisor

Total Credit Hours: 6

Total Credit Hours: 33