Project P11

Mini Attention Notebook

Concept: Dot Product Self-Attention Calculations

Self-attention calculations are frequently treated as opaque library calls, making it hard to debug vector representations.

It teaches Query/Key/Value projection transforms, scaling factors normalization, and Softmax attention mapping.

Self-attention is the core mathematical mechanism driving sequence modeling in transformers, determining how tokens attend to each other.

Dynamic math workbook running forward projections, scaling matrix outputs, and plotting soft heatmaps.

Character Tokens ArrayQKV Projections WeightsQK Dot Product ScoreScaling Normalized MatrixAttention Weights Heatmap

ReactTypeScriptMath.js

Defense Concept:

Why is self-attention scaled by the square root of key dimension dimension sizes?

GitHub Repo:GitHub

Live Demo:Live Demo

Verification Audit

Repository Checked: Yes

Repository Exists: Yes

Live Demo Verified: Yes

Demo Exists: Yes