What is a Transformer? - AI Glossary

The Transformer architecture, introduced in the 2017 paper 'Attention Is All You Need,' largely displaced earlier recurrent approaches (RNNs, LSTMs) for language tasks. Its key innovation is 'attention' (more precisely, self-attention), a mechanism that lets the model weigh how much every input token matters to every other token. Because these weights are computed for all tokens at once rather than one step at a time, Transformers train much faster on parallel hardware such as GPUs and capture longer-range dependencies than recurrent models. By 2020, Transformers dominated language tasks; by 2022, they had spread to vision (ViT) and audio (Whisper); by 2024, they underpinned most state-of-the-art deep learning systems. The original Transformer paper is one of the most cited papers in computer science.
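
To make the attention idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under stated assumptions, not the paper's full implementation: the function names `softmax` and `attention` and the toy token vectors are made up for this example, and the tokens are used directly as queries, keys, and values, whereas a real Transformer first passes them through learned linear projections and uses several attention heads.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this query against every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy input (assumed for illustration): three tokens, 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` says how strongly one token attends to every token, including itself. The loop handles one query at a time for readability, but no iteration depends on another, which is why practical implementations compute all rows together as matrix multiplications.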

Related terms