how much are u spending per month on llms? #asknostr
Replies (29)
<100 for now
$400
0
The biggest Claude Max plan, a Codex plan, and an M4 Studio's power bill... oh, and Midjourney. Whatever that adds up to.
Fuck, I didn't want to reply here
Shoulda used Wisp, Amethyst is confusing :wisp_cowboy:
was spending >$1k, now spending $0
HW + POWER only
which one did u buy? H100 or 5090?
B200
right way
wasn't a bad price, got it here: https://www.scan.co.uk/products/dgx-b200-8x-180gb-full-with-business-std-support-3y
~$100
tried various "turboquant" versions on git actually - the only one I got working is the llama.cpp one
it can load even bigger models, I heard
Did they merge support for that already?
the one in the Python package - did not understand how to use it: "uv pip install transformers torch accelerate turboquant-gpu"
llama.cpp - compiled it - then you can get the turbo option
./build/bin/llama-server -m model.gguf --cache-type-k turbo3 -fa on
a vanilla compile will not have that
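The steps above, as a minimal sketch. This assumes a "turboquant" fork of llama.cpp exposes the extra cache types; the repo URL is hypothetical, and vanilla upstream llama.cpp won't accept `turbo3` (as noted further down the thread):

```shell
# Clone the (hypothetical) turboquant fork - upstream llama.cpp lacks the option
git clone https://github.com/example/llama.cpp-turboquant
cd llama.cpp-turboquant

# Standard llama.cpp CMake build; -DGGML_CUDA=ON enables GPU offload
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Run the server with the turbo KV cache type and flash attention enabled
./build/bin/llama-server -m model.gguf --cache-type-k turbo3 -fa on
```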
have to download a bigger model later and test if the turbo thing really works
took a while to learn to prep it fast
Especially true in longer comment sections.
But I'm confident that you're using it right.
Codex, Gemini normie, cursor, random apis
I'm running the latest release b8705 but it's not listed as a KV cache type yet
should work then - experts are doing it in different ways
whichever works best, stick to that
beat me to it!
1 (future) Lambo per month. Give or take a few.
how much is that in euro?
I don't know. I don't have Euros.
how do u pay for daily expenses? don't you live in the EU? all my bills are in euros
I pay my bills and the rest goes into BTC. Holding EUR longer than needed feels awkward.