تخطي إلى المحتوى الرئيسي

قاموس

ما هو Transformer؟

The neural network architecture underlying modern LLMs and most image AI — introduced by Google in 2017 and quickly the dominant approach.

The Transformer architecture, introduced in the 2017 paper 'Attention Is All You Need,' replaced earlier recurrent (RNN, LSTM) approaches for language tasks. Its key innovation: 'attention,' a mechanism that lets the model weigh how much every input token matters to every other token, in parallel rather than sequentially. This made Transformers vastly faster to train and capable of capturing longer-range dependencies. By 2020, Transformers dominated language tasks; by 2022, vision (ViT) and audio (Whisper); by 2024, basically everything in deep learning. The original Transformer paper is one of the most cited in computer science history.

مصطلحات ذات صلة

العودة إلى قاموس AI