Alan Smeaton

Prof Alan Smeaton discusses Large Language Models (LLMs) and why there is so much discussion about them

In a piece for RTE Brainstorm, Prof Alan Smeaton, Professor of Computing at DCU, discussed how generative AI, a new form of AI built on Large Language Models (LLMs), is already having a dramatic impact on our lives after only a short period.

‘AI’ has become such a constant buzzword over the past 18 months that it is hard to believe we have been using Artificial Intelligence in our everyday lives for over a decade. From Netflix recommendations to medical diagnoses, this older form of AI worked only in the background, analysing content rather than generating it.

Prof Alan Smeaton says:

“The rate of development and deployment of AI, especially generative AI, over the last 18 months has been dizzying. Most forecasts indicate that generative AI will, or is already, having an impact across many of the tasks we perform across education, finance, law, the creative arts, and more.”

This recent explosion of AI technologies was spurred largely by OpenAI’s GPT-4 model used in ChatGPT, which is by some distance the most popular LLM in use. ChatGPT, like all subsequent LLMs deployed by tech corporations in the last year, was trained by taking inconceivably large amounts of data and turning them into an embedding space.

“Imagine a bag of coloured balls. An embedding space is a flat surface where the balls are arranged in such a way that those with similar colours are placed closer together. In an embedding space for words, words which are related like "king" and "queen", or "computer" and "keyboard" would be closer together while the words "queen" and "keyboard" are further apart. For LLMs, the number of words to be arranged, the size of the embedding space, and the time needed to train or arrange the words, are enormous.”
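The analogy can be sketched in a few lines of Python. This is only a toy illustration: the two-dimensional coordinates below are invented by hand for this example, whereas a real LLM learns its embeddings from training data and uses far more dimensions. The names `TOY_EMBEDDINGS` and `distance` are likewise hypothetical.

```python
import math

# Hand-picked 2-D positions on a "flat surface" (an assumption made
# purely for illustration; real embeddings are learned, not chosen).
TOY_EMBEDDINGS = {
    "king": (0.90, 0.10),
    "queen": (0.85, 0.20),
    "computer": (0.10, 0.90),
    "keyboard": (0.20, 0.85),
}


def distance(word_a, word_b):
    """Straight-line (Euclidean) distance between two words' positions."""
    return math.dist(TOY_EMBEDDINGS[word_a], TOY_EMBEDDINGS[word_b])


if __name__ == "__main__":
    # Related words sit close together; unrelated words sit far apart.
    for pair in [("king", "queen"), ("computer", "keyboard"), ("queen", "keyboard")]:
        print(pair, round(distance(*pair), 3))
```

With these made-up positions, "king" and "queen" end up much closer to each other than "queen" and "keyboard", which is the property the quote describes.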

With almost every medium-sized and large tech company announcing its own LLM, each boasting an even bigger set of training data on which the model is founded, the rate of growth in these generative AI models has been exponential. Given this dramatic rate of improvement, some professionals in the area of AI assistive technology believe that, with a large enough data set, LLMs will be able to reason and complete tasks at a level beyond that of humans.

This prospect prompts a conversation about which jobs are safe from the seemingly unyielding march of AI, and which are not. In stark contrast to the older AI technologies that have operated in the background for over a decade, OpenAI has made generative AI available instantly and to everyone, without any time to implement guardrails. In reaction to the mass hunt for data to feed these LLMs, the ‘EU AI Act’ will attempt to regulate how these models are trained and retain some control over their progression.

Read the RTE Brainstorm article here: https://www.rte.ie/brainstorm/2024/0521/1450438-large-language-models-artificial-intelligence-open-ai-chat-gpt/