technically you could get an rtx 6000 pro and have 96gb of vram on a single card
it is true that two cards aren't the same as one big one - there's a performance penalty and added complexity
but also, comparing 128gb of shared memory vs 96gb of vram on gpus is not exactly apples to apples - there's a lot to consider in terms of workload and performance. sadly, afaik desktop gpus don't have nvlink support anymore, so you need a motherboard with good pcie 5.0 support to get max bandwidth between the cards
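if you want to sanity-check what link your cards actually negotiated, here's a minimal sketch using pynvml (assuming the nvidia-ml-py package is installed; the output format is just illustrative):

```python
# minimal sketch: query each gpu's negotiated pcie link with pynvml
# (pip install nvidia-ml-py; the import name is pynvml)
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        gen = pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle)
        width = pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"gpu {i}: pcie gen{gen} x{width}, "
              f"{mem.total / 2**30:.0f} GiB vram")
finally:
    pynvml.nvmlShutdown()
```

if a card reports gen4 or a narrower width than x16, transfers between the cards will be correspondingly slower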
it also depends on what you want to do with them and what kinds of models you want to run
generally the hardest workload to run locally is coding agents - they like to have a lot of context, so you hit vram limits very quickly, and it's hard to get comparable performance out of small gpus
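some back-of-the-envelope math on why long context eats vram so fast - a rough kv cache estimate (the layer/head numbers below are illustrative for a 70b-class model with GQA, not from any specific config):

```python
# rough kv cache size: 2 (K and V) * layers * kv_heads * head_dim
# * context_len * bytes_per_element (2 for fp16/bf16).
# example numbers are illustrative, plug in your model's real config.
def kv_cache_gib(layers, kv_heads, head_dim, context_len, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 2**30

for ctx in (8_192, 32_768, 131_072):
    print(f"{ctx:>7} tokens -> {kv_cache_gib(80, 8, 128, ctx):.1f} GiB kv cache")
```

so a single agent session at 128k context can claim ~40 GiB on top of the weights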
overall - buying great gpus is always a good idea, since the models keep getting better and the tooling is maturing, so two rtx 6000s can give you a lot of bang for the buck.
will it perform the same as using opus/gemini3/codex? you'll probably need to bump the budget another order of magnitude for that
you can get decent performance out of an rtx 6000 with quantized models, according to the internetz
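the arithmetic at least supports that - a rough lower bound on weight memory is params * bits / 8, ignoring kv cache and runtime overhead:

```python
# rough lower bound on weight memory: params * bits / 8.
# ignores kv cache, activations, and runtime overhead.
def weights_gib(params_billions, bits_per_weight):
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30

for params, bits in ((70, 16), (70, 8), (70, 4)):
    size = weights_gib(params, bits)
    verdict = "fits" if size < 96 else "doesn't fit"
    print(f"{params}b @ {bits}-bit ~ {size:.0f} GiB -> {verdict} in 96gb")
```

so a 70b model that won't fit at fp16 fits with room to spare at 8-bit or below, leaving headroom for kv cache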
Replies (1)
my take is that, unless you want to develop your own LLMs, I would wait for the consumer architecture to stabilize instead of being an early adopter and spending too much on a local rig.