
A new Manulife partnership allows Gene Cheung, a professor in the Lassonde School of Engineering, to build on his pioneering research making AI smaller in ways that could benefit both the environment and the practical use of large language models.
In recent years, AI tools have become widely adopted across a range of sectors, supporting many applications. Many of these tools are powered by deep learning (DL) models, which learn patterns directly from large volumes of data by adjusting internal parameters – the settings a model changes as it learns. An example of this is the large language model (LLM).
For some organizations, however, adoption can prove challenging. “One main challenge is the cost and time required to train billions of parameters from collected data, and the cost of using LLMs for inference – generating text answers, images, voices and videos in response to requests,” says Cheung. “Their sheer scale translates to large daily operating costs for businesses.”
There are also environmental considerations. The longer it takes to train an AI model, the more electricity is consumed. “This is why AI tech companies currently demand huge energy supplies and are building nuclear plants to meet their needs,” he says. Much of this electricity comes from carbon‑intensive sources, contributing to greenhouse gas emissions and environmental impact.

Over the last several years, Cheung, a faculty member in the Department of Electrical Engineering and Computer Science, has worked with international partners to develop smaller, more efficient alternatives.
His research focuses on miniaturizing transformer models, a type of deep learning model that excels at discovering complex patterns in data – from language and images to time‑series signals – by analyzing relationships across all parts of the input.
“If we can miniaturize DL models by one to two orders of magnitude, that would mean much smaller training and operation costs,” he says. “It would also save a lot of electricity and thus be environmentally friendly.”
To date, Cheung’s research group at York has applied miniaturized models to a variety of areas, including imaging, traffic and weather data, and even brain signals measured by EEG. They have already reduced model sizes by up to 100 times without noticeable drops in performance. Their smaller models can perform image processing tasks, such as denoising and interpolation, as effectively as much larger state‑of‑the‑art systems, while using only a fraction of the model parameters.
Recently, Cheung and his collaborators received an opportunity to explore a new application of their innovative methods: LLMs, some of the largest and most widely used AI systems today. These models are trained on massive collections of text to understand and generate human language.
Last summer, Cheung connected with Eugene Wen, vice-president and global chief data scientist at Manulife, a multinational insurance and financial services company investing in advancing its AI capabilities and applied AI research.
“Our goal is to build advanced AI solutions that balance high speed and accuracy with low energy consumption to reduce costs and our carbon footprint,” says Wen.
The company was seeking an LLM that could answer customer queries quickly and accurately while using minimal computing resources and electricity, keeping both costs and energy use low. Manulife provided funding for Cheung’s research, including support for PhD students to participate, continuing its commitment to partnering with universities on joint research projects.
Now, in partnership with Manulife, Cheung is pursuing this project in collaboration with his graduate students and Professor Vicky Zhao, a longtime friend and research partner from Tsinghua University in China.
Building on their previous work, Cheung and his collaborators are exploring a novel approach to applying their miniaturization techniques to LLMs. In an earlier project, they trained a parameter-efficient graph-based denoiser – an AI system that gradually removes noise from grainy images to produce a clear result using a learned similarity graph.
Generating text from scratch with an LLM can also be interpreted as a sequence of denoising steps, which means the denoiser they developed can be redeployed in the language context. By structuring the generative model as sequential denoising stages, they hope to reduce the number of parameters needed, speed up training and lower energy use – creating smaller, faster and more efficient LLMs.
Cheung says the work with Manulife also allows him to pursue his broader research philosophy. “The main driver of my research is to understand,” he says.
He notes that most off‑the‑shelf LLMs operate like black boxes, with limited visibility into why different operations are stacked together in particular configurations. By applying his miniaturization techniques to LLMs, he can test these ideas on a new type of AI system, learning what the model truly needs to know and reducing unnecessary complexity.
“As signal processing researchers, my colleagues and I strive to understand systems in a more fundamental way so that we learn only what needs to be learned – the ‘known unknowns.’ In so doing, we reduce model parameters,” he says. This approach helps create smaller, more efficient language models and furthers his goal of understanding AI at a foundational level.
Manulife data scientists are part of the research team, providing data and experience in building generative AI solutions to solve business challenges. This arrangement allows Cheung and his team to continue their research and refine models for real-world impact, while Manulife can explore practical applications, such as reducing operational costs and environmental footprint. It also enables Cheung to pursue the broader objectives that drive his work.
“We hope that the long‑term impact of the research is to enable more frugal model learning that is more energy‑efficient and environmentally friendly,” he says. “That way everyone can benefit from the power of AI without paying a substantial environmental price.”
