Abdelkader Baggag, Yousef Saad
Numerical Algorithms (2025)
In an age where Artificial Intelligence (AI) is being integrated into nearly every domain of science and engineering, it has become essential for experts in Numerical Linear Algebra to explore the foundational elements of deep learning and identify ways to contribute to its development. What is particularly exciting is that Numerical Linear Algebra (NLA) lies at the heart of Machine Learning and, more broadly, of AI. All AI techniques fundamentally rely on four core components: data, optimization methods, statistical intuition, and linear algebra. The initial phase of any neural network model involves transforming the problem into one that can be tackled using numerical methods, particularly optimization techniques. In Large Language Models (LLMs), for instance, this first step involves mapping words or subwords into tokens, which are then embedded into Euclidean spaces. From that point on, LLMs rely heavily on vectors, matrices, and tensors. The aim of this article is to outline the essential components of deep learning methods from a linear algebra perspective. It covers deep neural networks, multilayer perceptrons, and the concept of “attention,” which plays a crucial role in large language models as well as other machine learning applications. A significant portion of the discussion focuses on methods that leverage graphs in neural networks, such as Graph Convolutional Networks. The paper concludes with reflections on the future role of numerical linear algebra in the age of AI.
https://link.springer.com/article/10.1007/s11075-025-02218-2
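The abstract's first step for LLMs (mapping words or subwords to tokens, then embedding them into a Euclidean space) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's method: the toy word-level vocabulary, the embedding dimension, and the random embedding matrix are all assumptions made here for clarity; real LLMs use learned subword tokenizers (e.g., BPE) and trained embedding matrices.

```python
import numpy as np

# Hypothetical toy vocabulary; real systems derive token ids
# from a learned subword tokenizer, not a word list like this.
vocab = {"numerical": 0, "linear": 1, "algebra": 2, "drives": 3, "ai": 4}

rng = np.random.default_rng(0)
d = 8                                       # embedding dimension (assumed)
E = rng.standard_normal((len(vocab), d))    # embedding matrix: one row per token

def embed(sentence: str) -> np.ndarray:
    """Tokenize (here: lowercase + split) and look up rows of E,
    giving each token a point in the Euclidean space R^d."""
    ids = [vocab[w] for w in sentence.lower().split()]
    return E[ids]                           # shape: (num_tokens, d)

X = embed("Numerical linear algebra drives AI")
print(X.shape)  # (5, 8)
```

From here on, everything the model does (attention included) is linear algebra on arrays like `X`.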