Control plane
Draft
Detail editor for one draft. Save, approve into queue, or reject out of the editorial flow.
local DBprivate
Source
2026-03-25 05:43:16.000000
Turns out you can run enormous Mixture-of-Experts on Mac hardware without fitting the whole model in RAM by streaming a subset of expert weights from SSD for each generated token - and people keep finding ways to run bigger models
Kimi 2.5 is 1T, but only 32B active so fits 96GB
primary source_tweetref tweet
reference: https://x.com/garrytan/status/2036680293890060571
Quoted original
seikixtc (@seikixtc) · Tue Mar 24 00:58:11 +0000 2026
I got a 1T-parameter model running locally on my MacBook Pro.
LLM: Kimi K2.5
1,026,408,232,448 params (~1.026T)
Hardware: M2 Max MacBook Pro (2023) w/ 96GB unified memory
Running on MLX with a flash-style SSD streaming path + local patching.
This is an experimental setup and https://t.co/qfoblgUpY5
Draft text
Req 2026-03-25T0601-TOP1
Queue membership is preserved when editing an already approved draft.