Publications

Deep learning, transformers and graph neural networks: a linear algebra perspective

Abdelkader Baggag, Yousef Saad
Numerical Algorithms (2025)

Distortion-aware Brushing for Reliable Cluster Analysis in Multidimensional Projections

Hyeon Jeon, Michael Aupetit, Soohyun Lee, Kwon Ko, Youngtaek Kim, Ghulam Jilani Quadri
IEEE Transactions on Visualization and Computer Graphics (2025)

ArnoldiGCL: Graph Contrastive Learning via Learnable Arnoldi-Based Guided Spectral Chebyshev Polynomial Filters

Mustafa Coşkun, Abdelkader Baggag, Mehmet Koyutürk
Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2025)

HCT-QA: A Benchmark for Question Answering on Human-Centric Tables

Mohammad Shahmeer Ahmad, Zan A Naeem, Michael Aupetit, Ahmed Elmagarmid, Mohamed Eltabakh, Xiaosong Ma, Mourad Ouzzani, Chaoyi Ruan
arXiv preprint arXiv:2504.20047 (2025)

Measuring the Validity of Clustering Validation Datasets

Hyeon Jeon, Michael Aupetit, DongHwa Shin, Aeri Cho, Seokhyeon Park, Jinwook Seo
IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

Fanar: An Arabic-Centric Multimodal Generative AI Platform

Fanar Team: Ummar Abbas, Mohammad Shahmeer Ahmad, Firoj Alam, Enes Altinisik, Ehsannedin Asgari, Yazan Boshmaf, Sabri Boughorbel, Sanjay Chawla, Shammur Chowdhury, Fahim Dalvi, Kareem Darwish, Nadir Durrani, Mohamed Elfeky, Ahmed Elmagarmid, Mohamed Eltabakh, Masoomali Fatehkia, Anastasios Fragkopoulos, Maram Hasanain, Majd Hawasly, Mus’ab Husaini, Soon-Gyo Jung, Ji Kim Lucas, Walid Magdy, Safa Messaoud, Abubakr Mohamed, Tasnim Mohiuddin, Basel Mousi, Hamdy Mubarak, Ahmad Musleh, Zan Naeem, Mourad Ouzzani, Dorde Popovic, Amin Sadeghi, Husrev Taha Sencar, Mohammed Shinoy, Omar Sinan, Yifan Zhang, Ahmed Ali, Yassine El Kheir, Xiaosong Ma, Chaoyi Ruan
arXiv preprint arXiv:2501.13944 (2025)

RetClean: Retrieval-Based Data Cleaning Using LLMs and DataLakes

Zan Ahmad Naeem, Mohammad Shahmeer Ahmad, Mohamed Y Eltabakh, Mourad Ouzzani, Nan Tang
Proceedings of the VLDB Endowment (2024)

A pragmatic perspective on AI transparency at workplace

Ghanim Al-Sulaiti, Mohammad Amin Sadeghi, Lokendra Chauhan, Ji Lucas, Sanjay Chawla, Ahmed Elmagarmid
AI and Ethics (2024)

Classes are Not Clusters: Improving Label-Based Evaluation of Dimensionality Reduction

Hyeon Jeon, Yun-Hsin Kuo, Michael Aupetit, Kwan-Liu Ma, Jinwook Seo
IEEE Transactions on Visualization and Computer Graphics (2024)

Cross Modal Data Discovery over Structured and Unstructured Data Lakes

Mohamed Y. Eltabakh, Mayuresh Kunjir, Ahmed Elmagarmid, Mohammad Shahmeer Ahmad
Proceedings of the VLDB Endowment (2023)

RPT: relational pre-trained transformer is almost all you need towards democratizing data preparation

Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, Mourad Ouzzani
Proceedings of the VLDB Endowment (2021)

Deep Learning for Blocking in Entity Matching: A Design Space Exploration

Muhammad Ebraheem, Saravanan Thirumuruganathan, Shafiq Joty, Mourad Ouzzani, Nan Tang
Proceedings of the VLDB Endowment (2021)

Steering Distortions to Preserve Classes and Neighbors in Supervised Dimensionality Reduction

Benoît Colange, Jaakko Peltonen, Michael Aupetit, Denys Dutykh, Sylvain Lespinats
Proceedings of NeurIPS (2020)

Toward Perception-Based Evaluation of Clustering Techniques for Visual Analytics

Michael Aupetit, Michael Sedlmair, Mostafa M. Abbas, Abdelkader Baggag, Halima Bensmail
Proceedings of the IEEE Visualization Conference (2019)

DeepER–Deep Entity Resolution

Muhammad Ebraheem, Saravanan Thirumuruganathan, Shafiq Joty, Mourad Ouzzani, Nan Tang
Proceedings of the VLDB Endowment (2018)

Multidimensional Projection for Visual Analytics: Linking Techniques with Distortions, Tasks, and Layout Enrichment

Luis Gustavo Nonato, Michael Aupetit
IEEE Transactions on Visualization and Computer Graphics (2019)

Rayyan—a web and mobile app for systematic reviews

Mourad Ouzzani, Hossam Hammady, Zbys Fedorowicz, Ahmed Elmagarmid
Systematic Reviews (2016)

Machine Learning-Driven Insights and Predictions for CO2 Adsorption in Metal-Organic Frameworks

Skander Charni, Raeesh Muhammad, Abdulkarem I. Amhamed, Brahim Aissa, Halima Bensmail
International Conference on Thermal Engineering (ICTEA) (2025)

Comprehensive Analysis of Rare Variants Associated with Genetic Predisposition to Non-BRCA Familial Breast Cancer Among Arabs

Ehsan Ullah, Hikmat Abdel-Razeq, Sana Bentebbal, Abdullah Shaar, Nehad Alajez, Mohamad Saad, Julie V. Decock
Clinical Cancer Research (2025)

Genome‐Wide Association Study for Resting Electrocardiogram in the Qatari Population Identifies 6 Novel Genes and Validates Novel Polygenic Risk Scores

Nahin Khan, Abdullah Shaar, Khalid Kunji, Atlas Khan, Mohamed Elshrif, Mohammed Bashir, Mohammed Thamer Ali, Ayman Al Haj Zen, Krzysztof Kiryluk, Georges Nemer, Akl C. Fahed, Mohamad Saad
Journal of the American Heart Association (2025)

Tisslet: Tissues-based Learning Estimation for Transcriptomics

Ahmed Miloudi, Aisha Al-Qahtani, Thamanna Hashir, Mohamed Chikri, Halima Bensmail
BMC Bioinformatics (2025)

Multi-omics and machine learning reveal context-specific gene regulatory activities of PML::RARA in acute promyelocytic leukemia

William Villiers, Audrey Kelly, Xiaohan He, James Kaufman-Cook, Abdurrahman Elbasir, Halima Bensmail, Paul Lavender, Richard Dillon, Borbála Mifsud, Cameron S. Osborne
Nature Communications (2023)

OutSingle: a novel method of detecting and injecting outliers in RNA-Seq count data using the optimal hard threshold for singular values

Edin Salkovic, Abdelkader Baggag, Ahmed Gamal Rashed Salem, Halima Bensmail, Mohammad Amin Sadeghi
Bioinformatics (2023)

Comprehensive review and assessment of computational methods for predicting RNA post-transcriptional modification sites from RNA sequences

Zhen Chen, Pei Zhao, Fuyi Li, Yanan Wang, A. Ian Smith, Geoffrey I. Webb, Tatsuya Akutsu, Abdelkader Baggag, Halima Bensmail, Jiangning Song
Briefings in Bioinformatics (2020)

BCrystal: An interpretable sequence-based protein crystallization predictor

Abdurrahman Elbasir, Raghvendra Mall, Khalid Kunji, Reda Rawi, Zeyaul Islam, Gwo Yu Chuang, Prasanna R. Kolatkar, Halima Bensmail
Bioinformatics (2020)

DeepCrystal: A Deep Learning Framework for Sequence-based Protein Crystallization Prediction

Abdurrahman Elbasir, Balasubramanian Moovarkumudalvan, Khalid Kunji, Prasanna R. Kolatkar, Halima Bensmail, Raghvendra Mall
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2018)

RGBM: Regularized gradient boosting machines for identification of the transcriptional regulators of discrete glioma subtypes

Raghvendra Mall, Luigi Cerulo, Luciano Garofano, Veronique Frattini, Khalid Kunji, Halima Bensmail, Thais S. Sabedot, Houtan Noushmehr, Anna Lasorella, Antonio Iavarone, Michele Ceccarelli
Nucleic Acids Research (2018)

What is User Engagement?: A Systematic Review of 241 Research Articles in Human-Computer Interaction and Beyond

Bernard J. Jansen, Kathleen Guan, Joni Salminen, Kholoud Aldous, Soon-gyo Jung
Proceedings of the Conference on Human Factors in Computing Systems (CHI) (2025)

Cipherbot: A Learning Platform for AI-Augmented Education

Soon-Gyo Jung, Johanne Medina, Kholoud Aldous, Jinan Azem, Joni Salminen, Bernard J Jansen
Proceedings of the Augmented Humans International Conference (2025)

AI-Driven Disaster Response and Displacement Monitoring

Noora Al-Emadi, Muhammad Imran, Yin Yang, Ingmar Weber, Fabjan Lashi, Gaia Rigodanza, Ivana Hajžmanová, Ferda Ofli
Communications of the ACM (2025)

Analysing Satellite Imagery Classification under Spatial Domain Shift across Geographic Regions

Sara Al-Emadi, Yin Yang, Ferda Ofli
International Journal of Computer Vision (2025)

When Personas Talk to You: Evaluating the Evolution of User Personas from Static Profiles to Conversational User Interfaces

Ilkka Kaate, Joni Salminen, Soon-Gyo Jung, Trang Thi Thu Xuan, Jinan Y Azem, João M Santos, Bernard J Jansen
Proceedings of the ACM Designing Interactive Systems Conference (2025)

Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery

Sara Al-Emadi, Yin Yang, Ferda Ofli
Computer Vision and Pattern Recognition (CVPR) (2025)

Evaluating Robustness of LLMs on Crisis-Related Microblogs across Events, Information Types, and Linguistic Features

Muhammad Imran, Abdul Wahab Ziaullah, Kai Chen, Ferda Ofli
WWW 2025 – Proceedings of the ACM Web Conference (2025)

Human-centred artificial intelligence in progressive education: unravelling the benefits and challenges in Qatar’s HEIs

Bernard J. Jansen, Soon-gyo Jung, Ali Farooq, Joni Salminen, Kholoud Aldous, Pilira Stella Msefula, Amani Alabed, Salar M. Khan, Richard O’Kennedy
The Future of Education Policy in the State of Qatar, Singapore: Springer Nature (2025)

PersonaCraft: Leveraging language models for data-driven persona development

Soon Gyo Jung, Joni Salminen, Kholoud Khalil Aldous, Bernard J. Jansen
International Journal of Human-Computer Studies (2025)

(Won Deployed Application Award) Flood Insights: Integrating Remote and Social Sensing Data for Flood Exposure, Damage, and Urgent Needs Mapping

Zainab Akhtar, Umair Qazi, Aya El-Sakka, Rizwan Sadiq, Ferda Ofli, Muhammad Imran
AAAI Conference on Artificial Intelligence (2024)

Measuring Engagement Through Remote Interactions of Customers: Introducing METRIC

Jinan Y. Azem, Joni Salminen, Soon-gyo Jung, Bernard J. Jansen
International Symposium on Networks, Computers and Communications (ISNCC) (2023)

Employing large language models in survey research

Bernard J. Jansen, Soon-gyo Jung, Joni Salminen
Natural Language Processing Journal (2023)

Understanding Audiences, Customers, and Users via Analytics

Bernard J. Jansen, Kholoud K. Aldous, Joni Salminen, Hind Almerekhi, Soon-gyo Jung
Springer International Publishing AG (2023)

Mapping Flood Exposure, Damage, and Population Needs Using Remote and Social Sensing: A Case Study of 2022 Pakistan Floods

Zainab Akhtar, Umair Qazi, Rizwan Sadiq, Aya El-Sakka, Muhammad Sajjad, Ferda Ofli, Muhammad Imran
ACM Web Conference 2023 – Proceedings of the World Wide Web Conference, WWW (2023)

Incidents1M: A Large-Scale Dataset of Images with Natural Disasters, Damage, and Incidents

Ethan Weber, Dim P. Papadopoulos, Agata Lapedriza, Ferda Ofli, Muhammad Imran, Antonio Torralba
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023)

Data-driven personas

Bernard J. Jansen, Joni Salminen, Soon-gyo Jung, Kathleen Guan
Springer Nature (2021)

CrisisMMD: Multimodal twitter datasets from natural disasters

Firoj Alam, Ferda Ofli, Muhammad Imran
Proceedings of the International AAAI Conference on Web and Social Media (2018)

AIDR: Artificial Intelligence for Disaster Response

Muhammad Imran, Carlos Castillo, Ji Lucas, Patrick Meier, Sarah Vieweg
Conference on World Wide Web (WWW) (2014)