The Home Page of the
Graduate Database Course COSC 6340
Spring 2004

Basic Course Information COSC 6340

Instructor: Dr. Christoph F. Eick
Office: 501E PGH
Office hours: MO 3:30-5p; TU 1-2p and 3:30-5p; TH 9:30-11a
 

Teaching Assistant: Zhenghong Zhao
Office: 518 PGH
Office hours: MO/TH 4:30pm-6:00pm
e-mail: zhenzhao@cs.uh.edu


Class meets: TU/TH 11:30am-1:00pm
Classroom: 232 PGH
Cancelled classes: none

Course Materials

Required Text:
Raghu Ramakrishnan and Johannes Gehrke, Database Management Systems,
McGraw Hill, Third Edition, 2002.
Call number:
Link to Textbook Homepage
Recommended:
Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques
Morgan Kaufmann Publishers, 2001, ISBN 1-55860-489-8
Link to Data Mining Book Home Page
Other books with relevant material:
Ramez Elmasri and Shamkant Navathe, Fundamentals of Database Systems, Third Edition
Addison Wesley ISBN: 0-8053-1755-4
Link to Navathe's Book/Course Page

News COSC 6340 Spring 2004

  • I enjoyed teaching the course and would like to wish all students a happy Summer 2004!
  • There have been some delays concerning the grading of the PhD Qualifying Exam; I expect that the results of the QE will be available on June 4, 2004.
  • The letter grades have been posted (see Zhenghong's webpage). A more detailed grade report will be posted no later than May 19, 2004. A student's course score was determined using the following weights: M1 (16%), M2 (25%), Final Exam (31%), all other work (28%: 2 homeworks, 2 programming projects, and the group project; we also decided to give different presentation grades to students belonging to the same group).
  • The final exam will not be returned to students. However, you can see your final exam on the following dates: We., May 26, 12:50p-2p, or Th., August 19, 11:00a-12:30p, or Tu., August 31, 3p-4:30p.
  • I expect that the grades for the 2004 Data Management PhD Qualifying Exam will be available on June 2, 2004. An e-mail will be sent to all students who took the 2004 Data Management QE as soon as the results are known!
  • Solution sketches for Homework2 (problems 6 and 9) are available now.
  • Review Sheet Final Exam COSC 6340 on May 11, 11a
  • Review Sheet Part2 Data Management QE on May 14, 10a.
  • The list of papers for the 2004 Data Management Qualifying Exam has been finalized! The four papers can be downloaded from the following directory: http://www2.cs.uh.edu/~ceick/6340/QE2004/

    Prerequisites

    The class prerequisites are COSC 3480 or equivalent (undergraduate database class) and MATH 3336. The first 4 weeks of the COSC 6340 lectures will review some undergraduate material, but the review will be quite fast-paced and far from complete. I decided not to drop the students who did not take COSC 3480 from the class, but rather to give the members of this group the opportunity to acquire the missing knowledge in the first 4 weeks of the semester. If you have neither taken an undergraduate database class nor acquired the necessary knowledge in a different setting, you will have to work quite hard in the first 4 weeks of the semester.

    Material Covered in COSC 6340

    Students are assumed to be familiar with most of the material in chapters 1-5, 7-11 and 18 of the required textbook. These chapters discuss material that is typically covered in undergraduate database courses, such as COSC 3480 in our undergraduate program. If you took a database class a long time ago, studying those chapters prior to January 31 is highly recommended.

    The course is subdivided into six parts (chapter numbers refer to chapters in the second edition of the textbook).

    Spring 2004 Projects and Other Activities for COSC 6340

    Useful Information on how to install and use ORACLE
    Lab1 (using SQL)
    Lab2 (using PL/SQL)
    Research Review Group Project
    Assignment1
    Assignment2
    Ungraded Homeworks (see homework section)

    Group Project 2004

    2004 Groups and their Topics.

    2004 Group Project Websites

    Group One: Multi-Relational Data Mining
    Group Two: Scalable Clustering Algorithms
    Team 3: Sequence Data Mining
    Team 4: Grid Data Management
    Team 5: the Semantic Web Group (I also strongly recommend that you read The W3C Semantic Web Activity Statement that is part of the review material for the COSC 6340 final exam).
    Team 6: Constructing Data-centric Web Applications

    Fall 2003 Lab Projects for COSC 3480

    Project1
    Project2
    Project3
    Project4 (still subject to change)
    Project5 and Project6

    Remark: If you never used relational database management systems before, it might not be a bad idea to do some of those projects.

    Oracle 9i

    If you are interested in installing Oracle on your own computer, go to the Oracle website, register as a user, click the download button, and follow our Installation Instructions. Moreover, Oracle 9 is also installed on machines in the 3rd and 5th floor labs in the PGH building. Each student receives an account number that allows him/her to use the departmental version of Oracle.

    Lecture Schedule Spring 2004

    see Course Information and Introduction to Data Management

    Elements of COSC 6340 in Spring 2003

  • Review Exam (Feb. 13) --- weight 16% (Review List for the Feb. 13, 2003 Exam; "Old" Exam0 with Solutions; 2003 Exam0 with Solutions; Review List for the Feb. 13, 2003 Review Exam for the second edition; Review List for the Feb. 13, 2003 Undergraduate Material Review Exam for the third edition)
  • Midterm Exam (March 25) --- weight 25% (Review List for Midterm Exam; 2001 Midterm Exam with Solutions; 2002 Midterm Exam with Solutions; 2003 Midterm Exam with Solutions)
  • Final Exam (Tu., May 6, 11a) --- weight 33% (Review List for the Spring 2001 Final; Review List for the 2003 Final Exam)
  • 2 projects and 2 homeworks --- weight 26% (due dates: TBDL). The displayed weights for Spring 2003 are preliminary and subject to change.

    Class Transparencies Spring 2003

  • 0 Course Information and Introduction to Data Management (updated on February 16, 2004)
  • I Basic Concepts of Databases
  • II Implementation of Relational Operators, Query Optimization, and Physical Database Design
  • III Relational Database Design
  • IV Introduction to KDD and Making Sense of Data
  • V Internet Databases and XML
  • VI Object-Oriented Databases

    Relevant pages from the Han/Kamber book in 2004

  • chapter 1 pages 1-9 (sections 1.1 and 1.2 only)
  • chapter 2 pages 39-68
  • chapter 6 pages 225-239 (excluding 6.2.4), 244-259 (excluding 6.5), 269-271 (6.7 summary only)
  • chapter 7 pages 279-283
  • chapter 8 pages 335-354 and 370-376 (grid-based methods)

    Important Database Conferences

    International Conference on Data Engineering (ICDE)
    International Conference on Very Large Databases (VLDB)
    International ACM SIGMOD Conference on Management of Data

    Important KDD Conferences

    KDD
    PKDD
    ICDM

    2003 Projects and Graded Homeworks

    2003 Specification Graded Homework1 (due We., March 19 (electronically) and Tu., April 1 (hard copy only))
    Specification Graded Homework2 2003 (due on Saturday, April 26, 2003, 11p (electronic submission)) new!!
    2003 Project1 (due Sa., March 15, 2003, 9p; electronic submission)
    2003 Project2 (student presentations are scheduled for April 17 and April 22, 2003).

    Spring 2003 Project2 Groups

    Decision Tree Learning for Large Data Sets --- Group1
    Decision Tree Learning on Very Large Data Sets --- Group2
    Clustering Large Data Sets Group
    Genomic Database Group
    Querying and Mining Data Streams Group
    Iceberg Queries Group

    Grading

    There will be an undergraduate material review exam, a midterm exam, and a final exam. Each student has to have a weighted average of 74.0 or higher in the exams of the course in order to receive a grade of "B-" or better for the course. Students will be responsible for material covered in the lectures and assigned in the readings. All homeworks and project reports are due on the date specified. No late submissions will be accepted. This policy will be strictly enforced. Course grades will be computed using a weight of approx. 72% for the exams, and a weight of 28% for the homeworks/assignments/group projects.

    Translation number to letter grades:
    A:100-90 A-:90-86 B+:86-82 B:82-77 B-:77-74 C+:74-70
    C: 70-66 C-:66-62 D+:62-58 D:58-54 D-:54-50 F: 50-0
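The weighting and the translation table above can be sketched in a few lines of Python. This is only an illustration, not the official grading procedure. It uses the Spring 2004 weights from the news section (M1 16%, M2 25%, Final Exam 31%, all other work 28%) and rests on two assumptions that the page does not state: a score exactly on a boundary (e.g. 90) receives the higher grade, and a student whose weighted exam average is below 74.0 is capped at "C+". All function and variable names are hypothetical.

```python
# Spring 2004 weights as posted in the news section (fractions of the course score).
WEIGHTS = {"m1": 0.16, "m2": 0.25, "final": 0.31, "other": 0.28}
EXAMS = ("m1", "m2", "final")

# Lower bounds taken from the translation table.
# ASSUMPTION: a score exactly on a boundary gets the higher grade.
CUTOFFS = [
    (90, "A"), (86, "A-"), (82, "B+"), (77, "B"), (74, "B-"), (70, "C+"),
    (66, "C"), (62, "C-"), (58, "D+"), (54, "D"), (50, "D-"),
]
B_MINUS_OR_BETTER = {"A", "A-", "B+", "B", "B-"}


def course_score(scores):
    """Weighted course score on a 0-100 scale."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)


def exam_average(scores):
    """Weighted average over the three exams only, rescaled to 0-100."""
    total_weight = sum(WEIGHTS[part] for part in EXAMS)
    return sum(WEIGHTS[part] * scores[part] for part in EXAMS) / total_weight


def letter(score):
    """Translate a number score into a letter grade using CUTOFFS."""
    for lower_bound, grade in CUTOFFS:
        if score >= lower_bound:
            return grade
    return "F"


def course_grade(scores):
    """Letter grade for the course.

    ASSUMPTION: when the weighted exam average is below 74.0, a grade of
    "B-" or better is replaced by "C+"; the page only says that 74.0 is
    required for "B-" or better, not which grade is given instead.
    """
    grade = letter(course_score(scores))
    if exam_average(scores) < 74.0 and grade in B_MINUS_OR_BETTER:
        return "C+"
    return grade
```

For example, a student with 70 on every exam and 100 on all other work has a course score of about 78.4, which the table maps to "B", but under the capping assumption the exam average of 70 limits the course grade to "C+".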

    Only machine-written solutions to homeworks and assignments are accepted (the only exceptions are figures and complex formulas). Be aware of the fact that our only source of information is what you have turned in. If we are unable to understand your solution, you will receive a low score. Moreover, students should not throw away returned assignments or tests.

    Students may discuss course material and homeworks, but must take special care to discern the difference between collaborating in order to increase understanding of course materials and collaborating on the homework / course project itself. We encourage students to help each other understand course material to clarify the meaning of homework problems or to discuss problem-solving strategies, but it is not permissible for one student to help or be helped by another student in working through homework problems and in the course project. If, in discussing course materials and problems, students believe that their like-mindedness from such discussions could be construed as collaboration on their assignments, students must cite each other, briefly explaining the extent of their collaboration. Any assistance that is not given proper citation may be considered a violation of the Honor Code, and might result in obtaining a grade of F in the course, and in further prosecution.

    CS lab: you are responsible for protecting your own files. If you leave files on computers and other students turn in these files as their solution for the course project, you are violating the university's academic honesty code. One way to prevent your solutions from being copied by other students is to edit and save all your SQL files on your floppy disk (and run the system using the data on the floppy). Alternatively, you could create local folders with your files on the hard drive and remove all files from the folder before you log off (you could even write a script that does the cleanup).
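A cleanup script of the kind mentioned above could look like the following minimal Python sketch. It is only one possible approach: the function name `clean_folder` and the example folder path are hypothetical, and the script assumes Python is available on the lab machines. It deletes everything inside the given folder but keeps the folder itself.

```python
import shutil
from pathlib import Path


def clean_folder(folder):
    """Delete every file and subfolder inside `folder`, keeping the folder.

    Returns the number of top-level entries that were removed.
    """
    root = Path(folder)
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove a subfolder and its contents
        else:
            entry.unlink()         # remove a regular file or a symlink
        removed += 1
    return removed


# Example call before logging off (hypothetical path, adjust to your own folder):
# clean_folder("C:/temp/my_cosc6340_work")
```

The deletion is permanent, so copy anything you still need to your own disk before running it.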

    Communication with the teaching staff

    We strongly encourage students to come to my office hours or to talk to me directly after class. If a homework clarification is posted after a student has completed an assignment, the student should contact us as soon as possible to check if the assumptions s/he made are going to be accepted.

    Please do not e-mail us with grading questions. If you want us to explain why points were taken off, you can talk to us during our office hours.

    Course Exams

    UG Material Review Exam

    Review sheet for the 2004 UG Review Exam (to be given on Th., Feb. 26, 2004).
    Solutions to a Sample Exam0 (in Word)
    Number Grades Review Exam
    Review List for 2002 Exam0

    Midterm Exam

    Exam1 (March 30, 2000) (in Word)
    2000 Midterm Solutions (in Word)
    Solution Sketches April 2002 Midterm Exam

    Final Exam

    The 2002 final exam is scheduled for Tuesday, May 6, 11a-2p in ... --- be aware of the room change!!
    Review list for the 2002 Final Exam (new!!!)
    2000 Final Exam (was given on May 9, 2000)
    Solution Sketches Spring 2002 Final Exam (was given on May 7, 2002)

    Database Qualifying Exam

    The first part of the qualifying exam will be the COSC 6340 final exam. The second part of the qualifying exam will be a separate 90 minute exam that centers on the contents of 4 scientific papers and on material that was covered in the midterm exam (but not in the final exam). The second part is scheduled for Friday, May 9, 2003, 10:30a-noon in room 315 PGH. 5 Scientific Papers for the 2003 QE (you can check out the papers for copying during my office hours, but the papers can also easily be found by doing a web search). Review List for Part2 of the Qualifying Exam

    Course Homeworks and Projects Spring 2000

    Useful Links

    last updated: May 17, 2004, 11a