You can now run a 70B model on a single 4GB GPU, and it even scales up to the colossal Llama 3.1 405B on just 8GB of VRAM.
#AirLLM uses "layer-wise inference": instead of loading the whole model, it loads, computes, and flushes one layer at a time.
- No quantization needed by default
- Supports Llama, Qwen, and Mistral
- Works on Linux, Windows, and macOS
100% Open Source.
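The layer-wise idea above can be sketched in a few lines of plain Python. This is a toy illustration, not AirLLM's actual code: the per-layer files, the `matvec` helper, and the two-layer "model" are all invented for the example. The point is only the memory pattern: weights live on disk, and at any moment only one layer's weights are held in memory.

```python
import os
import pickle
import tempfile

# Toy "layer": a weight matrix applied as out[j] = sum_i x[i] * W[i][j]
# (no activation function, for brevity).
def matvec(W, x):
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

# Shard the layers to disk, one file per layer -- analogous to how
# layer-wise inference keeps the full model out of GPU memory.
layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity layer
    [[2.0, 0.0], [0.0, 2.0]],  # doubling layer
]
tmpdir = tempfile.mkdtemp()
for i, W in enumerate(layers):
    with open(os.path.join(tmpdir, f"layer_{i}.pkl"), "wb") as f:
        pickle.dump(W, f)

# Inference loop: load one layer, run it, flush it, load the next.
x = [1.0, 3.0]
for i in range(len(layers)):
    with open(os.path.join(tmpdir, f"layer_{i}.pkl"), "rb") as f:
        W = pickle.load(f)  # only this layer's weights are in memory
    x = matvec(W, x)        # forward pass through this layer
    del W                   # "flush" before loading the next layer

print(x)  # -> [2.0, 6.0]
```

Peak memory is one layer plus the activations, which is why a 4GB card can walk through a 70B model: the cost is latency (disk reads per layer per token), not capacity.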
captjack
captjack@plebchain.club
npub1te0u...jgwp
72k gang ready?
benchmarking #compare AI

#siamstr #bot #derivative update #goldstr #gold 

Mac mini is $499 for the model with 16GB RAM and a 256GB SSD - fuck Apple
BOSGAME P6 #Ryzen 9 6900HX Mini PC Gaming, 32GB LPDDR5X Mini Desktop, 1TB PCIe 4.0 NVMe SSD, Dual LAN, AX210 Wi-Fi 6E, BT 5.3 $500 - run win / linux
there are cheaper ones if you search around, also with 32GB RAM and a 1TB SSD #minipc
#healthstr

20K+ masked thugs n #monsters roaming in some neighborhoods, stay alert
if u missed the early internet it's ok
if u missed .com it's ok
if u missed crypto bubble of 2017 that's ok
WE NOW LIVE IN THE ERA OF THE #FOSS REVOLUTION