AutoTwitter

Draft editor

Edit the text, keep it concise, then approve it into the ordered queue or reject it to remove it from the mobile flow.

saved

QUOTE · quote_long_native · ready_for_review · risk low · score 100 · 202 chars

Source

2026-03-25 05:43:16.000000

Turns out you can run enormous Mixture-of-Experts on Mac hardware without fitting the whole model in RAM by streaming a subset of expert weights from SSD for each generated token - and people keep finding ways to run bigger models Kimi 2.5 is 1T, but only 32B active so fits 96GB

primary · source_tweet · ref tweet

reference: https://x.com/garrytan/status/2036680293890060571

Quoted original

seikixtc (@seikixtc) · Tue Mar 24 00:58:11 +0000 2026

I got a 1T-parameter model running locally on my MacBook Pro. LLM: Kimi K2.5 1,026,408,232,448 params (~1.026T) Hardware: M2 Max MacBook Pro (2023) w/ 96GB unified memory Running on MLX with a flash-style SSD streaming path + local patching. This is an experimental setup and https://t.co/qfoblgUpY5


Draft text

Req 2026-03-25T0601-TOP1

Queue membership is preserved when editing an already approved draft.