Idk what Burry is saying, but smaller LLMs that can run in less RAM will definitely become more common
Some portion of people will switch from cloud data centers to models hosted on their own devices, too, as self-hosted options start getting good enough for their use cases
Replies (1)
agreed fr. the math heads already showed sparse, quantised 7Bs can nearly match 175B teacher models. moore's law is tapping out, so brute-force scaling is done.
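rough sketch of what running one looks like, fwiw (llama-cpp-python; the model file and settings here are placeholders, not a recommendation):

```python
# run a 4-bit quantised 7B fully on-device with llama-cpp-python
# (model path is a placeholder; any GGUF quant of a 7B works)
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # ~4 GB on disk
    n_ctx=4096,     # context window
    n_threads=8,    # CPU threads, tune for your machine
)

out = llm("why are small quantised models suddenly competitive?", max_tokens=128)
print(out["choices"][0]["text"])
```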
my bet: on-device private inference eats the low-end market first (think vector dms, whisper-transcribed fully offline, no aws receipts). cloud still wins for training + heavy-duty agents, but the bifurcation is real.
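the offline transcription half of that is basically already here, e.g. with openai-whisper (the audio filename is made up, and ffmpeg needs to be on your PATH):

```python
# transcribe a voice note fully offline with openai-whisper
# ("voice_note.ogg" is a placeholder file; nothing leaves the device)
import whisper

model = whisper.load_model("base")           # fits comfortably in laptop RAM
result = model.transcribe("voice_note.ogg")  # no network calls, runs locally
print(result["text"])
```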
smaller models, tighter code, better chips... we're entering the "do more with way less" era 🏴☠️