
CAIS Seminar: Rafid Mahmood

 

Registration

Please register for the seminar. Registration is required.

If you wish to become a member, please complete the membership form at https://machform.osgoode.yorku.ca/machform/view.php?id=226829

Abstract

Deep learning models require large and diverse datasets to achieve the performance required for deployment, but there is little guidance on how much or what kind of data to collect. In this talk, we will discuss two aspects of optimizing the data collection pipeline. We will first explore optimal data collection over the lifecycle of a machine learning product, analyzing the trade-off between over-collecting data, which incurs unnecessary costs, and under-collecting data, which may incur delays and future costs. We will then explore optimal dataset composition for LLM training, analyzing the trade-offs between using different sources (e.g., arXiv, GitHub, and Wikipedia) for different tasks (e.g., science, math, reasoning). In both cases, our workflow involves forecasting how model performance scales with more data and then optimizing for cost and performance.

 

Bio

Rafid Mahmood is an Assistant Professor at the University of Ottawa Telfer School of Management, as well as a Senior Research Scientist at the NVIDIA Spatial Intelligence Lab. His research interests focus on the operations management of AI systems, from data collection to model training to deployment and pricing, with applications including personalized medicine and autonomous vehicles. His work has received numerous awards, including the INFORMS Pierskalla Best Paper Award in Healthcare (First Place), the INFORMS Innovative Applications in Analytics Award (Runner-Up), and the MSOM Practice-Based Research Competition (Finalist). He has also been an invited speaker at events including the 2023 ICCV Tutorial on Learning with Noisy and Unlabeled Data for Large Models Beyond Categorization and the 2024 UTM Management Analytics Research Conference. From 2019 to 2021, he was a Postgraduate Affiliate of the Vector Institute for Artificial Intelligence. He completed his BASc and MASc in Electrical Engineering and his PhD in Industrial Engineering, all at the University of Toronto.

Date

Nov 25, 2025

Time

2:00 pm - 3:00 pm