Control plane
Draft
Detail editor for one draft. Save, approve into queue, or reject out of the editorial flow.
local DB · private
Source
2026-03-16 06:08:57.000000
I'm pretty confident this can be leveraged to graft a modified backwards pass onto the LM head of a pretrained model to improve the validation loss over standard LM head bwd. More to come soon.
Quoted original
Nathan Godey (@nthngdy) · Thu Mar 12 19:10:02 +0000 2026
🧵New paper: "Lost in Backpropagation: The LM Head is a Gradient Bottleneck"
The output layer of LLMs destroys 95-99% of your training signal during backpropagation, and this significantly slows down pretraining 👇 https://t.co/lnbGfesIFA
Draft text
Req 2026-03-16T0631-TOP1
Queue membership is preserved when editing an already approved draft.