Kieran kieran@snort.social 3 weeks ago GitHub: ggml : add CPU TurboQuant KV cache types (TBQ3_0 / TBQ4_0) by elusznik · Pull Request #21089 · ggml-org/llama.cpp. Summary: This PR adds CPU-only TurboQuant KV-cache support for two new cache types: tbq3_0 and tbq4_0. The scope is intentionally narrow for the first ...
captjack 🏴☠️✨💜 captjack@plebchain.club 3 weeks ago should work then - experts are doing it in different ways; whichever works best, stick to that