Lecture 6: Attention Mechanism (Self-attention)

16 Jul 2024 · 317 words · 2 min read
CC BY 4.0 (unless otherwise stated or for reposted articles)

Slides download: Lecture6

Input

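A piece of text, a speech clip, or a graph can each be treated as a set of vectors whose size varies from input to input; that set is the input to Self-attention.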

Output

N to N

N to 1

N to N’

Self-attention

Background

Framework

You can use a single Self-attention layer, or stack several Self-attention layers on top of one another.
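
Stacking works because a Self-attention layer maps N input vectors to N output vectors, so its output can be fed straight into another layer. The snippet below is a minimal sketch of that idea, assuming scaled dot-product attention with square weight matrices so that the vector dimension stays the same. The names `attention_layer` and `make_params`, the sizes, and the random weights are all illustrative assumptions, not taken from the slides. Real models usually interleave other layers (for example fully connected ones) between the attention layers.

```python
import numpy as np

def attention_layer(X, params):
    """One self-attention layer. X has shape (N, d); the output has the same shape."""
    W_q, W_k, W_v = params
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    # Pairwise relevance scores between all N positions, scaled by sqrt(d).
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Row-wise softmax (max subtracted for numerical stability).
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def make_params(rng, d):
    # Hypothetical random initialisation; a trained model would learn these.
    return tuple(rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))

rng = np.random.default_rng(0)
N, d = 5, 8                      # 5 input vectors of dimension 8 (assumed sizes)
X = rng.normal(size=(N, d))

layers = [make_params(rng, d) for _ in range(3)]   # three stacked layers
H = X
for params in layers:
    H = attention_layer(H, params)

print(H.shape)  # (5, 8): still one vector per input position
```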

Algorithm

Matrix representation
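
The notes give only the heading here. As a hedged sketch, assuming the widely used scaled dot-product formulation, the whole computation can be written as a handful of matrix products: Q = X W_q, K = X W_k, V = X W_v, then output = softmax(Q Kᵀ / √d_k) V. The code stores one input vector per row; slides that store one vector per column use the transposed form of the same products. The weight names, dimensions, and random values are assumptions for illustration only.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the max along the axis for numerical stability.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (N, d_in), one row per input vector.

    Returns the outputs (N, d_v) and the attention matrix (N, N).
    """
    Q = X @ W_q                      # queries, (N, d_k)
    K = X @ W_k                      # keys,    (N, d_k)
    V = X @ W_v                      # values,  (N, d_v)
    d_k = K.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)   # each row sums to 1
    return A @ V, A

rng = np.random.default_rng(0)
N, d_in, d_k, d_v = 4, 8, 6, 5       # assumed sizes
X = rng.normal(size=(N, d_in))
W_q = rng.normal(size=(d_in, d_k))
W_k = rng.normal(size=(d_in, d_k))
W_v = rng.normal(size=(d_in, d_v))

out, attn = self_attention(X, W_q, W_k, W_v)
print(out.shape, attn.shape)         # (4, 5) (4, 4)
```

Row i of `attn` holds the weights with which position i attends to every position, so all N outputs are produced in parallel rather than one at a time.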

Multi-head Self-attention
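
A minimal sketch of the multi-head variant, assuming the common construction in which the query, key, and value projections are split into `num_heads` equal slices, attention is computed independently per slice, and the results are concatenated and passed through an output projection `W_o`. The function name, shapes, and random weights are illustrative assumptions; the slides may present the heads with separate matrices per head, which is equivalent to slicing one larger matrix.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(X, W_q, W_k, W_v, W_o, num_heads):
    """X: (N, d_model). Returns outputs (N, d_model) and weights (num_heads, N, N)."""
    N, d_model = X.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split(M):
        # (N, d_model) -> (num_heads, N, d_head)
        return M.reshape(N, num_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(X @ W_q), split(X @ W_k), split(X @ W_v)
    # Batched matrix product: one (N, N) score matrix per head.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores)
    heads = weights @ V                                   # (num_heads, N, d_head)
    # Concatenate the heads back into (N, d_model), then mix them with W_o.
    concat = heads.transpose(1, 0, 2).reshape(N, d_model)
    return concat @ W_o, weights

rng = np.random.default_rng(0)
N, d_model, num_heads = 4, 8, 2                           # assumed sizes
X = rng.normal(size=(N, d_model))
W_q, W_k, W_v, W_o = (rng.normal(size=(d_model, d_model)) for _ in range(4))

out, weights = multi_head_self_attention(X, W_q, W_k, W_v, W_o, num_heads)
print(out.shape, weights.shape)                           # (4, 8) (2, 4, 4)
```

Each head gets its own attention matrix, so different heads can learn to capture different kinds of relevance between positions.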

Positional Encoding
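
Self-attention on its own has no notion of order, so a position-dependent vector is typically added to each input vector. The notes do not say which encoding the lecture uses; the sketch below assumes the fixed sinusoidal scheme from the original Transformer paper, where even dimensions use sin(pos / 10000^(2i/d)) and odd dimensions use the matching cosine. Learned position vectors are an equally valid alternative. The function name and sizes are illustrative.

```python
import numpy as np

def sinusoidal_positional_encoding(num_positions, d_model):
    """Return a (num_positions, d_model) matrix of position vectors.

    Assumes d_model is even.
    """
    positions = np.arange(num_positions)[:, None]          # (num_positions, 1)
    even_dims = np.arange(0, d_model, 2)[None, :]          # 0, 2, 4, ...
    angles = positions / np.power(10000.0, even_dims / d_model)
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)                           # even dimensions
    pe[:, 1::2] = np.cos(angles)                           # odd dimensions
    return pe

rng = np.random.default_rng(0)
N, d_model = 6, 8                                          # assumed sizes
X = rng.normal(size=(N, d_model))                          # input vectors
pe = sinusoidal_positional_encoding(N, d_model)
X_with_position = X + pe                                   # added, not concatenated

print(pe.shape)  # (6, 8)
```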

Applications

Postscript

Relationship between Self-attention and CNN

Relationship between Self-attention and RNN

Self-attention for Graph