Big Data Systems Research Group

About us

The Data-Intensive Parallel Algorithms for AI (DIPAAI) group at UH develops parallel, scalable algorithms that minimize I/O to analyze data sets for machine learning (e.g. neural networks including deep learning, transformers, clustering, classification, regression, dimensionality reduction, variable/feature selection, time series) and graph analytics (connectivity, paths, cliques, vertex neighborhood, summarization). Our goal is to develop novel algorithms that cover the whole pipeline, from reading the data set from secondary storage to processing it in main memory on modern CPUs with a small RAM footprint, without exhausting the machine's capacity. We also develop algorithms and data structures that can exchange and read data in diverse formats (e.g. streams), so that the data can be analyzed flexibly and at high speed. In the past our research focused on parallel DBMSs (parallel SQL) and "Big Data" Hadoop systems (MapReduce, Spark). Today our focus has shifted to how theory can help AI and data science. Currently, most of our code development uses Python, Pascal, C++ and SQL.

Research Topics

Parallel data-intensive algorithms: neural networks, machine learning, graphs and cubes.

Optimizing and merging numerical methods and algorithms.

Time complexity and parallel speedup analysis of algorithms.

Data science applications: heart disease diagnosis, variable selection for cancer, saving energy with solar power, explaining water pollution, estimating and saving energy in large-scale query processing.

Database systems (earlier work): extending ER database models to manage data pre-processing, managing analytic workflows, detecting and solving data quality issues, querying source code, recursive queries, joins on graphs, cubes, skylines, pivoting, workload optimization, data partitioning.

Director

Prof. Carlos Ordonez
Department of Computer Science
University of Houston
Houston, TX 77204
contact: carlos .AT. uh .DOT. edu

Contact

University of Houston - Department of Computer Science - Big Data Systems Research Group