My research goal is to develop scalable, interoperable, parallel algorithms
available as libraries and programs to analyze data in a broad sense.
I currently work
on adapting and optimizing a broad spectrum of algorithms in machine learning and graphs.
I conduct research on both
multi-node processing with distributed memory/storage
and multi-threaded processing on multicore CPUs and GPUs.
I work on many science and engineering problems,
but most of my applied research is in medicine and healthcare.
In my lab, we develop programs in C++, SQL and Python, mostly on Unix.
Research topics (overview)
- Extending and optimizing data science languages, like Python and R, to analyze big data
- Scalable and parallel algorithms for big data analytics, mainly machine learning and graphs.
- Eliminating RAM, interoperability and speed limitations from
data science programming languages (Python, R).
- Big data problems: classifying documents,
information retrieval on bibliography records,
keyword search, large-scale matrix multiplication,
ontology construction, linked data and semantic web.
- Data science applications in medicine.
- Database modeling:
extending ER database models to manage data pre-processing,
managing analytic workflows, solving data quality issues, querying source code.
- Query processing: recursive queries, joins on graphs, cubes, skylines, pivoting,
workload optimization, data partitioning.
Articles available for download
Click on the top menu to download
author-prepared, unofficial versions of published articles:
conference/workshop proceedings and presentation slides.
These PDF files have 98% of the same content as the official versions, but in a different format.
Journal articles present the most important research results in polished form,
whereas proceedings articles present recent and preliminary research.
Articles in digital libraries, grant support
DBLP (90% complete; big subset of ACM; 2 months behind ACM)
Google Scholar (citations; 95% complete)
International Academic Service
- Associate Editor IEEE Transactions on Knowledge and Data Engineering (TKDE) 2017-2021.
- Associate Editor Data & Knowledge Engineering (DKE) 2019-2023.
- Associate Editor Intelligent Data Analysis (IDA) 2010-2013.
- Program Chair: DOLAP 2010, 2015.
- Program Chair: SADASC 2020.
- Program Chair: DaWaK 2018, DaWaK 2019.
- Program Chair: MEDI 2018.
- Program Vice Chair: Big Data 2020.
- PC member: IEEE Big Data 2016, 2020, 2021, 2022.
- PC member: DaWaK 2020, 2021, 2022.
- PC member: DOLAP 2008-2021.
- PC member: DEXA 2018-2020.
- PC member: BDA 2020.
- PC member: ADBIS 2020.
- PC member: SIGMOD 2016, SIGMOD 2017.
- PC member: AMW 2015, AMW 2016, AMW 2019, AMW 2020.
- PC member: KDD 2014, KDD 2015.
Colleagues and Collaborators
- Divesh Srivastava
AT&T Labs, USA
- Mike Stonebraker
- Ladjel Bellatreche
- Il-Yeol Song
Drexel University, USA
- Joe Hellerstein
University of California at Berkeley, USA
- Ophir Frieder
Georgetown University, USA
- Chris Jermaine
Rice University, USA
- Hanna Oktaba
- Hamid Pirahesh
- Sofian Maabout
- Oscar Romero
- Ahmad Ghazal
- Javier Garcia-Garcia
- Luis B. Morales
- Esteban Zimanyi
Université Libre de Bruxelles, Belgium
- Nicolas Lachiche
University of Strasbourg, France
- Hiroshi Oyama
The University of Tokyo, Japan