Developing AI from scratch requires building a new machine learning model and training it on selected data, and often also adapting existing models to the task at hand. However, all of this is resource-intensive, especially for large language models, and researchers have already begun discussing the post-large-language-model era.
“Further progress through sheer increases in computing power may have reached its limit. Doubling computational resources no longer produces proportional gains,” says Tuukka Ruotsalo, Associate Professor of Software Engineering at LUT.
Machine learning models are being developed to be more efficient
All modern AI is based on machine learning, which is why research often refers to AI simply as machine learning. For example, all current generative AI services rely on neural network models, a type of machine learning, for learning.
“In addition, a large part of the services that use AI are basically interactive software. The way AI works and learns is coded into the software. This allows the software to learn how to operate and improve based on the data fed to it and the feedback given by humans,” Ruotsalo says.
At the moment, one interesting area in the development of machine learning models is smaller, less resource-intensive models.
The aim is to make models more efficient by assessing how to achieve a sufficiently good outcome with less data. This can mean, for example, adapting existing models to new uses and using base models trained with large computational resources as the foundation for new models.
One way to reduce the size of models is to prune existing large and functionally accurate models. This streamlines them to a smaller size without the need to retrain them from scratch.
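The pruning idea can be sketched in a few lines. Below is a minimal magnitude-pruning example in plain Python; the function name and the threshold rule are illustrative assumptions, not details from the article or from Ruotsalo's work. Real systems apply the same principle to the weight tensors of a trained network.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    weights:  a flat list of trained weight values
    sparsity: fraction of weights to remove (0.0 to 1.0)

    Magnitude pruning assumes weights near zero contribute
    little to the model's output, so zeroing them shrinks the
    effective model while largely preserving its behaviour.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Threshold is the k-th smallest absolute value; ties at the
    # threshold are pruned as well.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]


# Prune half of a toy weight list: the two smallest magnitudes go to zero.
print(magnitude_prune([0.1, -0.5, 0.02, 0.9], 0.5))  # [0.0, -0.5, 0.0, 0.9]
```

After pruning, the zeroed weights can be stored and computed sparsely, which is what reduces memory use and energy consumption in practice.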
“In addition to monetary costs, the motivation is electricity consumption. When existing models can be used as a basis or the models can be made smaller already at the training stage, they consume less energy and can be used in a more versatile way. Reducing the size of the models will enable the use of AI in devices with limited computing power.”
Aiming for fair and transparent AI
As another development target, Ruotsalo highlights fairness and explainability.
Fairness means not only correcting AI’s biases but also transparency and the ability to evaluate a model's performance across different use cases and between different individuals and groups. It is essential to be able to explain how a model works and to assess the basis on which it reached a given conclusion.
Fairness is important, especially in applications where representativeness is sought instead of desirability. Research on the subject gained momentum from the evaluation of the COMPAS software used in the US legal system, which assesses a defendant’s likelihood of recidivism.
“Measuring the explainability and fairness of AI software helps to assess and measure its reliability, which is important in many applications – often even more important than just seemingly accurate conclusions.”
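One simple way to make such a measurement concrete is a group fairness metric. The sketch below computes the demographic parity gap, i.e. the difference in positive-prediction rates between groups; the function name and data shapes are illustrative assumptions, and this is just one of many fairness metrics discussed in the research literature, not necessarily the one used to evaluate COMPAS.

```python
def demographic_parity_gap(predictions, groups):
    """Return the gap in positive-prediction rates between groups.

    predictions: list of 0/1 model outputs (1 = positive decision)
    groups:      list of group labels, same length as predictions

    A gap of 0.0 means every group receives positive decisions at
    the same rate; larger gaps indicate the model treats groups
    differently on this measure.
    """
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    # Positive-decision rate per group.
    rates = [sum(preds) / len(preds) for preds in by_group.values()]
    return max(rates) - min(rates)


# Group 'a' gets positive decisions 2/3 of the time, group 'b' 1/3:
# the gap is about 0.33.
gap = demographic_parity_gap([1, 1, 0, 1, 0, 0],
                             ["a", "a", "a", "b", "b", "b"])
print(round(gap, 2))  # 0.33
```

Metrics like this make fairness auditable: instead of arguing about a model's intentions, one can report a number, track it across model versions, and compare it between deployment contexts.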
Ruotsalo himself is currently researching how to make machine learning and neural network models more effective, as well as the fairness of such models. In addition, his research deals with models based on human-machine interaction that learn from human behaviour and reactions.
Ruotsalo also studies the new applications they enable, such as models that analyse human physiology, information retrieval and recommendation systems, and the explainability and transparency of models to humans. In addition to LUT, he works as an assistant professor of machine learning at the University of Copenhagen.
In the future, AI should know our world
The third trend in research is improving AI’s multimodality.
“Many researchers think that the current way of training models with images, video and text is not enough for the intelligence that is being pursued. We should include more data from the physical world.”
According to Ruotsalo, the question can be thought of in terms of how much can be known about the world and how it works if the only way to observe it is by browsing the internet.
“Robotics and, for example, autonomous vehicles are already utilising multimodal signals but, in the end, they too are based on limited data measured with the sensors in use at the time. That's why even in simple tasks it’s clear that the models don't yet understand enough about our physical reality.”
However, development is progressing rapidly in this area as well. In the future, advanced models may be able to process a wide range of signals from different sensors – including data that helps them analyse and understand human sensory experiences.