AI Lesson & Submodule
Softmax & Sampling Mechanics
Study how raw model logits are turned into output probability distributions.
Why This Matters
Sampling mechanics determine how a model picks each next token, and the sampling settings control the trade-off between varied (creative) and predictable (conservative) output.
Deep-Dive Explanation
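The model outputs one raw score, called a logit, for every token in the vocabulary. The Softmax function converts these logits into a probability distribution that sums to 1. Temperature (T) rescales the logits before Softmax is applied: scaled logit = logit / T, so the probability of token i is p_i = exp(z_i / T) / sum_j exp(z_j / T), where z_i is the logit for token i. T must be greater than 0, and T = 1 leaves the logits unchanged. When T is low (e.g. 0.1), dividing by T amplifies the differences between logits, concentrating almost all of the probability on the top candidate. When T is high, the differences shrink and the distribution flattens, giving lower-ranked tokens a higher chance of selection.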
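The temperature-scaled Softmax described above can be sketched in a few lines of TypeScript. This is a minimal illustration rather than the implementation used by any particular model or by the project; the function name `softmaxWithTemperature` and the example logits are assumptions made up for this sketch.

```typescript
// Hypothetical helper for illustration; the name and signature are not from the lesson.
function softmaxWithTemperature(logits: number[], temperature: number = 1): number[] {
  if (temperature <= 0) {
    throw new Error("temperature must be greater than 0");
  }
  // Temperature step: divide every logit by T.
  const scaled = logits.map((z) => z / temperature);
  // Subtract the largest scaled logit before exponentiating. This avoids
  // overflow in Math.exp and does not change the resulting probabilities.
  const maxScaled = Math.max(...scaled);
  const exps = scaled.map((z) => Math.exp(z - maxScaled));
  const total = exps.reduce((sum, e) => sum + e, 0);
  // Normalize so the probabilities sum to 1.
  return exps.map((e) => e / total);
}

// Made-up logits for a three-token vocabulary.
const exampleLogits = [2.0, 1.0, 0.1];
const sharp = softmaxWithTemperature(exampleLogits, 0.1);   // nearly all mass on the first token
const neutral = softmaxWithTemperature(exampleLogits, 1.0); // plain Softmax
const flat = softmaxWithTemperature(exampleLogits, 10.0);   // close to uniform

console.log({ sharp, neutral, flat });
```

Comparing the three arrays shows the effect of T: the ranking of the tokens never changes, only how much probability the top token takes from the others.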
What You Will Learn
- How Softmax converts raw model output scores into probabilities
- How Temperature reshapes the probability curve
- How Top-p (nucleus) thresholds prune vocabulary candidates
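As a rough sketch of the Top-p (nucleus) pruning listed above: rank tokens by probability, keep the smallest set whose cumulative probability reaches the threshold p, zero out the rest, and renormalize what remains. The function names `topPFilter` and `sampleIndex`, and the example probabilities, are hypothetical and chosen only for this illustration.

```typescript
// Hypothetical helpers for illustration; names and signatures are not from the lesson.
function topPFilter(probabilities: number[], topP: number): number[] {
  if (topP <= 0 || topP > 1) {
    throw new Error("topP must be in the range (0, 1]");
  }
  // Rank tokens from most to least probable, remembering original positions.
  const ranked = probabilities
    .map((probability, index) => ({ index, probability }))
    .sort((a, b) => b.probability - a.probability);

  const kept = new Array<number>(probabilities.length).fill(0);
  let cumulative = 0;
  for (const candidate of ranked) {
    kept[candidate.index] = candidate.probability;
    cumulative += candidate.probability;
    // Stop once the kept tokens cover at least topP of the probability mass.
    if (cumulative >= topP) break;
  }
  if (cumulative === 0) return kept;
  // Renormalize so the surviving tokens sum to 1 again.
  return kept.map((p) => p / cumulative);
}

// Draw one token index from a distribution. `u` is a uniform random number in [0, 1).
function sampleIndex(probabilities: number[], u: number = Math.random()): number {
  let cumulative = 0;
  let lastNonZero = -1;
  for (let i = 0; i < probabilities.length; i++) {
    const p = probabilities[i];
    if (p <= 0) continue;
    lastNonZero = i;
    cumulative += p;
    if (u < cumulative) return i;
  }
  // Fallback for floating-point rounding: return the last token with non-zero mass.
  return lastNonZero;
}

// Made-up distribution over a four-token vocabulary.
const nucleus = topPFilter([0.5, 0.3, 0.15, 0.05], 0.9);
console.log(nucleus, sampleIndex(nucleus));
```

With a threshold of 0.9, the least likely token in this made-up distribution is pruned and can no longer be sampled, while the relative order of the remaining three is preserved.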
Concepts Covered
Softmax Function, Logits Probability Scaling, Nucleus Pruning
Mapped Foundation Project: Hyperparameter Playground
An interactive settings dashboard for inspecting how Temperature, Top-p, and penalties alter the Softmax probability distribution.
Architecture Preview
Logs visualizer showing vocabulary probability bars that update dynamically as the sliders change each parameter.
Raw Logits Array, Temperature Scale Function, Softmax Probability Converter
Tech Stack Planned
React, TypeScript, Tailwind CSS