ollama with qwq or gemma3
on mac?
Tried Qwen3 yet?
How much video RAM is needed to run a version of the models that are actually smart though? I tried the Deepseek model that fits within 8 GB of video RAM, and it was basically unusable.
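For a rough sense of scale, here's a back-of-the-envelope sketch (my own rule of thumb, not from the post; the overhead factor is an assumption) of why 8 GB only fits the small distills:

```python
# Back-of-the-envelope VRAM estimate for a quantized LLM.
# Rule of thumb only: real usage also depends on context length and
# KV cache; the 1.2x overhead factor here is an assumption.

def estimate_vram_gb(params_billions: float, bits_per_weight: int,
                     overhead_factor: float = 1.2) -> float:
    """Approximate VRAM in GB: weights x quantization width x overhead."""
    weight_gb = params_billions * bits_per_weight / 8  # GB for weights alone
    return weight_gb * overhead_factor

# A 7B model at 4-bit fits in ~4 GB, while a 32B model like qwq needs
# ~19 GB, which is why 8 GB cards are limited to the small variants.
for name, params in [("7B", 7.0), ("14B", 14.0), ("32B (qwq)", 32.0)]:
    print(f"{name}: ~{estimate_vram_gb(params, 4):.1f} GB at 4-bit")
```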
I wonder what I am doing wrong. I was so excited to get this set up, but I've been at it all day and keep running into hiccups. Here's my ChatGPT-assisted question:
I tried setting up Goose with Ollama using both qwq and gemma3, but I'm running into consistent errors in Goose:
error decoding response body
init chat completion request with tool did not succeed
I pulled and ran both models successfully via Ollama (>>> prompt showed), and pointed Goose to http://localhost:11434 with the correct model name. But neither model seems to respond in a way Goose expects, likely because they aren't chat-formatted (Goose appears to be calling /v1/chat/completions).
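In case it helps anyone debugging the same thing, here's a minimal sketch that reproduces the kind of tool-bearing request Goose appears to make, outside of Goose. It assumes Ollama's OpenAI-compatible endpoint at /v1/chat/completions; the get_weather tool is a made-up example following the standard OpenAI tools schema:

```python
import requests

# Probe Ollama's OpenAI-compatible chat endpoint with a tools array,
# mimicking the /v1/chat/completions call Goose seems to be making.
resp = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "qwq",  # or "gemma3"
        "messages": [
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for testing
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    },
    timeout=120,
)

# If the model's chat template doesn't support tool calling, Ollama
# returns an error here instead of a completion, which would explain
# "init chat completion request with tool did not succeed".
print(resp.status_code)
print(resp.text)
```

If this request fails while the same payload without the tools array succeeds, the problem is the model's template lacking tool support rather than the Goose config.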
@jack Are you using a custom Goose fork, adapter, or modified Ollama template to make these models chat-compatible?