General Information

Carlos Ordonez
Associate Professor
Department of Computer Science
University of Houston
Houston, TX 77204
firstname AT uh DOT edu

My research goal is to develop scalable, interoperable, parallel algorithms, available as libraries and programs, to analyze data in a broad sense. I currently work on adapting and optimizing a broad spectrum of algorithms in machine learning and graphs. I conduct research on both multi-node processing with distributed memory/storage and multi-threaded processing on multicore CPUs and GPUs. I work on many science and engineering problems, but most of my applied research is in medicine and healthcare. In my lab we develop programs in C++, SQL and Python, mostly on Unix.



Research

Bio: Carlos Ordonez studied at UNAM University in Mexico, earning a B.Sc. in actuarial science (applied math, similar to data science degrees) and an M.S. in computer science. He then pursued a Ph.D. at the Georgia Institute of Technology, advised by Edward Omiecinski, focusing on accelerating machine learning algorithms, and graduated in 2000. Carlos worked at NCR from 1998 to 2006, collaborating on the optimization of machine learning and cube query processing algorithms on the Teradata parallel DBMS. In 2006 Carlos joined the Department of Computer Science at the University of Houston, where he currently leads the Big Data Systems (BDS) lab. From 2013 to 2015 Carlos regularly visited MIT and collaborated with Michael Stonebraker, working on new-generation parallel DBMSs (columnar, arrays). From July 2014 to July 2015 Carlos worked as a visiting researcher at AT&T Labs-Research (formerly AT&T Bell Labs), where he conducted research with Divesh Srivastava on stream analytics, extending the R language, and data quality. His research projects have been funded by 3 NSF grants.

Research topics (overview):
  • Extending and optimizing data science languages, like Python and R, to analyze big data.
  • Scalable and parallel algorithms for big data analytics, mainly machine learning and graphs.
  • Eliminating RAM, interoperability and speed limitations from data science programming languages (Python, R).
  • Big data problems: classifying documents, information retrieval on bibliography records, keyword search, large-scale matrix multiplication, ontology construction, linked data and semantic web.
  • Data science applications in medicine: heart disease diagnosis, variable selection for cancer, microarray data analysis, epidemiology (hepatitis), neurology.
  • Older topics: extending ER database models to manage data pre-processing, managing analytic workflows, solving data quality issues, querying source code, recursive queries, joins on graphs, cubes, skylines, pivoting, workload optimization, data partitioning.
Most of my scientific articles are listed on DBLP. My citations are tracked in Google Scholar.



Education

  • B.Sc. in Applied Mathematics (Actuarial Sciences), UNAM University, Mexico, 1992.
  • M.S. in Computer Science, UNAM University, Mexico, 1996.
  • Ph.D. in Computer Science, Georgia Institute of Technology, USA, 2000.

International Academic Service

Journals:
  • Associate Editor IEEE Transactions on Knowledge and Data Engineering (TKDE) 2017-2021.
  • Associate Editor Data & Knowledge Engineering (DKE) 2019-2023.
  • Associate Editor Intelligent Data Analysis (IDA) 2010-2013.
Conferences:
  • Program Chair: DOLAP 2010, 2015.
  • Program Chair: SADASC 2020.
  • Program Chair: DaWaK 2018, DaWaK 2019.
  • Program Chair: MEDI 2018.
  • Program Vice Chair: Big Data 2020.
  • PC member: IEEE Big Data 2016, 2020, 2021, 2022.
  • PC member: DaWaK 2020, 2021, 2022.
  • PC member: DOLAP 2008-2021.
  • PC member: DEXA 2018-2020.
  • PC member: BDA 2020.
  • PC member: ADBIS 2020.
  • PC member: SIGMOD 2016, SIGMOD 2017.
  • PC member: AMW 2015, AMW 2016, AMW 2019, AMW 2020.
  • PC member: KDD 2014, KDD 2015.