Xiaohui Yu
I am a Professor and the Director at the School of Information Technology, York University. I received my BSc degree from Nanjing University, China, my MPhil from the Chinese University of Hong Kong, and my PhD from the University of Toronto. I am affiliated with the graduate programs in Information Systems & Technology, Electrical Engineering & Computer Science, and Mathematics & Statistics at York University.
My research is broadly situated in data-centric artificial intelligence (AI), with a particular focus on how data quality, management, and representation shape the effectiveness and reliability of machine learning (ML) systems. I study the foundations and systems aspects of data-centric AI by (1) designing ML-driven database components for efficient and intelligent query processing, (2) developing algorithms and infrastructures that make large-scale ML more robust, adaptive, and efficient, and (3) creating data preparation, valuation, and governance frameworks that enable trustworthy and scalable AI. I also work extensively with spatio-temporal data arising from intelligent transportation systems, location-based services, and social networks. My work emphasizes building AI that is not only model-driven, but also data-aware and context-sensitive, ensuring impact in high-stakes, real-world applications.
I am grateful for the generous support from NSERC, ORF-RE, and leading industry partners. Our research results have appeared in top-tier venues such as SIGMOD, VLDB, ICDE, KDD, and TKDE. I currently serve as Associate/Area Editor for TKDE, TKDD, and Information Systems, all of which are leading journals in data management and data science.
I am looking for self-motivated students to join our research group as PhD/Master's students or Post-doctoral Fellows. Please drop me a line if you are interested.
News
- August 2025: Served as General Co-Chair of ACM SIGKDD 2025, the world's premier data-science and AI conference, held in Toronto.
- August 2025: Our paper "Selective Cloud Offloading for Accurate and Efficient Object Detection" was accepted to ICDM 2025, a premier conference in data mining. Congratulations to Davood Dehghani as the first author.
- August 2025: Our paper "Approximating Gradient-Based Influence for Scalable Instruction Data Selection" was accepted to CIKM 2025, a premier conference in data science and AI. Congratulations to Mohammad Hasanloo, my first-year Master's student in IST, as the first author.
- August 2025: Our paper "Visualization-Oriented Progressive Time Series Transformation" was accepted to SIGMOD 2026, the top conference in data systems. Special thanks to my wonderful collaborators: Lingyu Zhang, Xin Chen, Huaiwei Bao, Wei Lu, Eugene Wu, and Yunhai Wang.
- July 2025: Congratulations to Davood Dehghani for successfully defending his Master's thesis!
- May 2025: Received NSERC Alliance funding with IBM as the industry partner for the project "Optimized data compression and scalable verification for efficient data migration".