General Information

Carlos Ordonez
Associate Professor
Department of Computer Science
University of Houston
Houston, TX 77204

(713)743-3350

Carlos Ordonez studied at UNAM, Mexico, where he earned a B.Sc. in applied mathematics and an M.S. in computer science. He then pursued a Ph.D. at the Georgia Institute of Technology, advised by Edward Omiecinski, focusing on extending database systems with data mining algorithms; he received the degree in 2000. Carlos worked at Teradata (formerly part of NCR) from 1998 to 2006, collaborating on optimizing machine learning and cube algorithms to run on the Teradata parallel DBMS. In 2006 Carlos joined the Department of Computer Science at the University of Houston, where he currently leads the DBMS lab. From 2013 to 2015 Carlos collaborated with Michael Stonebraker, regularly visiting the Database Group at MIT. From July 2014 to July 2015 Carlos was a visiting researcher at AT&T Research Labs (formerly AT&T Bell Labs), where he worked on stream analytics and data warehousing with Divesh Srivastava. His research has been funded by NSF.



Research: Parallel Database Systems, Big Data Analytics

My goal is to develop scalable, parallel algorithms for analyzing large data sets with machine learning and statistical models (e.g., clustering, classification, regression, dimensionality reduction, variable/feature selection, time series), cubes (ad hoc queries, decision support systems), and graphs (paths, cliques, vertex neighborhood). In the past I worked on sequential data mining algorithms and parallel row DBMSs, with applications in corporate data warehouses and medical data. After visiting MIT I changed direction: I am currently studying how to integrate algorithms with parallel DBMSs that use column- and array-based storage. After visiting AT&T Labs I became more interested in the R language and streams. On the popular "Big Data Analytics" Hadoop side, I have worked with MapReduce and Spark.
Research topics (overview):
  • Parallel algorithms.
  • Integrating machine learning and numerical methods with parallel DBMSs and Hadoop (MapReduce, Spark).
  • Removing RAM and parallelism limitations from mathematical software (R and MATLAB).
  • Query optimization: recursive queries, cubes, skylines, pivoting.
  • Semi-structured data: text, web pages, documents, ontologies.
  • Software engineering: ER modeling, data quality, debugging source code.
  • Applications: medicine, bioinformatics, finance, engineering, physics, network monitoring.
Published articles listed on: DBLP, Google Scholar.



Education

  • B.Sc. in Applied Mathematics, UNAM, Mexico, 1992.
  • M.S. in Computer Science, UNAM, Mexico, 1996.
  • Ph.D. in Computer Science, Georgia Institute of Technology, USA, 2000.