ChipTuner
ChipTuner@gitcitadel.com
npub1qdjn...fqm7
Building software they don't like. Free, as in freedom. Low-level and server engineer: libnoscrypt, NVault, vnlib. Staff @GitCitadel https://geyser.fund/project/gitcitadel
ChipTuner 5 months ago
I think many forgot that GFY comes after the GM. Like GM and GFY
ChipTuner 5 months ago
So, some prelim testing @Deleted Account @npub1sces...q7fu

2x Titan X (Maxwell) -> 2x Tesla P100 16GB

gpt-oss 20b: performance skyrocketed from less than 10 t/s to almost 40 t/s. It's amazing (it utilizes the new FP16 compute, I believe).
Small models <15b (Granite, Gemma, DeepSeek): about equal or worse performance.
Larger models >20b (Qwen 3, Gemma, DeepSeek 32b): nearly double, from ~4 to over 8 t/s.

Still pretty slow compared to really modern GPUs, but for the price, I'm not upset at all. I want to explore some driver stuff or maybe some tuning, because it seems like I should be seeing better numbers. Might just be the rest of the old server running it.

Power consumption is nearly 100 W less for the same loads. Idle is about 15 W higher per card (they idle around 30 W each instead of 15 W). Performance per watt has increased dramatically. The power limit was set to 150 W/card in all tests. The Titans typically went over it, and the P100s aren't even coming close to it.

I'll be playing with vGPU at a later date. I've always dreamt of a thin client setup, but it's just never really worked out.
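For anyone who wants to put rough numbers on the "performance per watt" point, here is a minimal Python sketch. The t/s values are rounded from the figures in the post. The wattages are hypothetical placeholders, since only the roughly 100 W difference was reported, not absolute draw. The function names `speedup` and `perf_per_watt_gain` are made up for illustration.

```python
def speedup(old_tps: float, new_tps: float) -> float:
    """Return how many times faster the new throughput is than the old."""
    if old_tps <= 0:
        raise ValueError("old_tps must be positive")
    return new_tps / old_tps


def perf_per_watt_gain(old_tps: float, new_tps: float,
                       old_watts: float, new_watts: float) -> float:
    """Return the ratio of new tokens-per-watt to old tokens-per-watt."""
    if old_tps <= 0 or old_watts <= 0 or new_watts <= 0:
        raise ValueError("old_tps and both wattages must be positive")
    # (new_tps / new_watts) / (old_tps / old_watts), rearranged
    return (new_tps * old_watts) / (old_tps * new_watts)


# Throughput rounded from the post: gpt-oss 20b went from <10 to ~40 t/s,
# larger models from ~4 to >8 t/s.
print("gpt-oss 20b speedup:", speedup(10, 40))
print("larger models speedup:", speedup(4, 8))

# HYPOTHETICAL wattages: the post only says total draw fell by nearly 100 W.
# 300 W -> 200 W is a placeholder, not a measurement.
print("perf/watt gain (placeholder power):", perf_per_watt_gain(4, 8, 300, 200))
```

Swap in measured wall or per-card power to get a real figure. With throughput doubling and power dropping at the same time, the two effects multiply, which is why the per-watt gain comes out larger than the raw speedup.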