plantimals's avatar
plantimals
rob@buildtall.com
npub1mkq6...r4tx
ΔC
https://cascade.engineer network wars clone
https://talos.nostr.xyz openclaw bot
https://drss.io -- bringing back the republic of blogs. an onramp for bringing RSS content, including podcasts, into NOSTR
https://npub.dev -- configure your outbox
plantimals 10 months ago
"that was the time I was most frightened, waiting for my turn. I'll never put on a lifejacket again"
plantimals 10 months ago
what is the most cost-effective way to run a #LocalLLM coding model? I'd like as much capacity as possible, for instance to run something like qwen3-coder, kimi-k2, magistral, etc. in their highest-fidelity instantiations. I see three high-level paths. buy:
- an nvidia card $$$
- an AMD card $$ + hassle with ROCm etc.
- a mac with system RAM high enough for this task $?$?
- something else?

it seems like 24GB is doable for quantized versions of these models, but that leaves little room, 4K tokens, for the context window. #asknostr #ai #llm
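The tradeoff in the post between model size and context room can be roughed out with simple arithmetic: quantized weights take about parameters × bits / 8 bytes, and the KV cache grows linearly with context length. Below is a minimal sketch of that estimate. The function names, the flat `overhead_gib` allowance, and the 16-bit cache default are assumptions made for illustration. Real runtimes add their own buffers, and the architecture numbers have to come from the model card of whichever model you are sizing.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights at a given quantization width."""
    return n_params * bits_per_weight / 8 / GIB


def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """KV cache bytes added per token of context.

    The factor of 2 covers one key and one value vector per layer.
    bytes_per_value=2 assumes a 16-bit cache (an assumption, not a given).
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 n_tokens: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size for a context of n_tokens."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value)
    return per_token * n_tokens / GIB


def max_context_tokens(budget_gib: float, n_params: float, bits_per_weight: float,
                       n_layers: int, n_kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2, overhead_gib: float = 1.0) -> int:
    """Tokens of context that fit after weights and a flat overhead allowance.

    overhead_gib is a hypothetical placeholder for runtime buffers and
    activations; tune it for your actual stack.
    """
    free_gib = budget_gib - overhead_gib - weights_gib(n_params, bits_per_weight)
    if free_gib <= 0:
        return 0
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value)
    return int(free_gib * GIB // per_token)


if __name__ == "__main__":
    # Placeholder architecture values, NOT the specs of any particular model.
    tokens = max_context_tokens(budget_gib=24, n_params=30e9, bits_per_weight=4,
                                n_layers=48, n_kv_heads=8, head_dim=128)
    print(f"context tokens that fit: {tokens}")
```

Plugging in the real layer count, KV head count, and head dimension for a given model shows quickly whether a 24GB budget leaves a usable context window at a given quantization, or whether the larger unified memory of a mac (or a second card) is what the workload needs.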
plantimals 1 year ago
@NowClaw look at my timeline and use that latent space neighborhood to generate an image for my profile header