last updated: February 12, 2026


COSC 6335: Data Mining in Fall 2024 (Dr. Eick)



2024 COSC 6335 Syllabus

Goals of the Data Mining Course

Data mining centers on finding novel, interesting, valid, and potentially useful patterns in data. It aims at transforming a large amount of data into a well of knowledge. Data mining has become a very important field in industry as well as academia. The course covers most of the important data mining techniques, covers the basics of data science, and provides background knowledge on how to conduct a data mining project. Topics covered in the course include exploratory data analysis, classification and prediction, clustering and similarity assessment, association analysis, outlier and anomaly detection, and interpreting and evaluating data analysis/data mining results. Basic visualization techniques and statistical methods will also be introduced.

Hands-on data mining experience will be provided in three assignments. You will also get some practical experience in evaluating data mining results from your fellow students and from data mining publications. Finally, you will learn how to use and program in R, the popular statistics, visualization, and data mining environment.

The topics of the course have some overlap with what is taught in the Machine Learning (COSC 6342) course. To reduce this overlap, this course places a little less emphasis on learning classification and prediction models (this topic will be covered "more quickly"), and more emphasis will be put on Data Science Basics, Exploratory Data Analysis, Association Analysis, Clustering, Outlier Detection, and Data Set Augmentation. There will be 3 assignments and 2 "paper walkthroughs" in 2026. There will be 3 group activities in 2026: Assignment2, leading paper walkthroughs, and Group Homework Credit (you will find a more detailed description of the group activities below).

Comments concerning this website

If you have any comments concerning this website, send e-mail to: ceick@uh.edu

Basic Course Information

Instructor: Dr. Christoph F. Eick
office hours: TU 4-5p, TH 9-10a
e-mail: ceick@uh.edu
TA: Janet Anagli
e-mail: jyanagli@CougarNet.UH.EDU
TA office hours: MO+TU 9-10a (on MS Teams)
class meets: TU/TH 2:30-4p
class room: SEC 203
classes taught by others: Likely, February 19
cancelled classes: Likely, March 12

Course Materials

Objectives Data Mining Course

Highly Recommended Text:
P.-N. Tan, M. Steinbach, A. Karpatne, and V. Kumar: Introduction to Data Mining,
Addison Wesley, Second Edition, 2019
Link to Book HomePage

Recommended Texts: NIST/SEMATECH e-Handbook of Statistical Methods (good online source covering exploratory data analysis, statistics, modelling, and prediction)

2026 Course Organization

I.	Introduction to Data Mining
II.	Density Estimation
III.	Similarity Assessment
IV.	Outlier and Anomaly Detection
V.	Autoencoders
VI.	Data Science Basics and Exploratory Data Analysis
VII.	Introduction to Clustering
VIII.	Classification: Basic Concepts, Decision Trees and Neural Networks
IX.	Association Analysis: Rule, Sequence, Graph and Collocation Mining
X.	Spatial Data Mining
XI.	Data Augmentation
XII.	Data Storytelling
XIII.	Advanced Clustering
XIV.	Preventing False Discoveries (Optional Topic)
XV.	Preprocessing
  

Important Course Dates

Thursday, February 5: Assignment1 Lab
Thursday, March 5, 11:30a: First Course Exam
Tuesday, March 10: First Paper Walkthrough
Thursday, March 12: Likely, no lecture
March 17+19: no lecture (Spring Break)
Thursday+Tuesday, April 2+7: Task2 Student Presentations
Thursday, April 16: Second Paper Walkthrough
Thursday, April 30, 11:30a: Second Course Exam

News COSC 6335 (Data Mining) Fall 2026