Replies (23)

Just had a quick look into this and it seems possible to do this for free, i.e. to run open-source models like Llama 2, Mistral, or Phi-2 locally on the Mac using Ollama. No internet, no API keys, no limits, and Apple Silicon runs them well.
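For anyone who wants to try it, roughly this should work once Ollama is serving on its default port and you've pulled a model; the model name and prompt are just placeholders:

    import requests

    # Minimal sketch: ask a locally pulled model a question via Ollama's REST API.
    # Assumes `ollama serve` is running on the default port and the model has
    # already been pulled (e.g. with `ollama pull llama2`).
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama2",  # swap for mistral, phi, etc.
            "messages": [{"role": "user", "content": "Say hello in one sentence."}],
            "stream": False,
        },
    )
    print(resp.json()["message"]["content"])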
Oh, you answered a reply. Primal doesn't show replies to replies by default? 🤦‍♂️
At least I have self-custodial zaps set up in Primal web now. One of these days their iOS app will too... 🥂
I just run qwen3:30b-a3b with a 64k context (tweaked in the Modelfile) and it can do things 🤙. Uses 43 GB.
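For reference, the same 64k bump can be set per request instead of baking "PARAMETER num_ctx 65536" into a Modelfile; a rough sketch, assuming the default local Ollama port:

    import requests

    # Minimal sketch: request a 64k context window per call by passing num_ctx
    # in the options field of Ollama's chat API. Equivalent to putting
    # "PARAMETER num_ctx 65536" in a Modelfile and running `ollama create`.
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "qwen3:30b-a3b",
            "messages": [{"role": "user", "content": "Summarise this long document..."}],
            "options": {"num_ctx": 65536},  # 64k context; memory use grows with this
            "stream": False,
        },
    )
    print(resp.json()["message"]["content"])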
How much video RAM is needed to run a version of these models that is actually smart, though? I tried the DeepSeek model that fits within 8 GB of video RAM, and it was basically unusable.
The next step is for LLMs to know each other's strengths and weaknesses, so we can get them to select the best LLM for a particular task.
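A toy sketch of that routing idea, assuming a local Ollama server; the candidate models and their one-line descriptions are only placeholders:

    import requests

    OLLAMA = "http://localhost:11434/api/chat"
    # Hypothetical candidate pool; the descriptions are just illustrative.
    CANDIDATES = {
        "qwen2.5-coder": "strong at code generation and debugging",
        "mistral": "fast general-purpose chat",
        "llama3": "good at long-form reasoning and writing",
    }

    def ask(model, prompt):
        # One non-streaming chat call against the local Ollama server.
        resp = requests.post(OLLAMA, json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        })
        return resp.json()["message"]["content"]

    def route(task):
        # Ask a small "router" model to pick the best candidate for the task.
        menu = "\n".join(f"- {name}: {desc}" for name, desc in CANDIDATES.items())
        choice = ask("mistral",  # router model; any small local model works
                     f"Task: {task}\nCandidates:\n{menu}\n"
                     "Reply with only the single best candidate name.").strip()
        return choice if choice in CANDIDATES else "llama3"  # fall back if it rambles

    task = "Write a Python function that parses RSS feeds."
    best = route(task)
    print(f"Routing to {best}")
    print(ask(best, task))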
Diyana 7 months ago
I wonder what I am doing wrong. I was so excited to get this set up, but I've been at it all day and keep running into hiccups. Here's my ChatGPT-assisted question: I tried setting up Goose with Ollama using both qwq and gemma3, but I'm running into consistent errors in Goose:

    error decoding response body
    init chat completion request with tool did not succeed

I pulled and ran both models successfully via Ollama (the >>> prompt showed) and pointed Goose to http://localhost:11434 with the correct model name. But neither model seems to respond in a way Goose expects, likely because they aren't chat-formatted (Goose appears to be calling /v1/chat/completions). @jack Are you using a custom Goose fork, adapter, or modified Ollama template to make these models chat-compatible?
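In case it helps anyone debugging the same thing, this is the probe I'd run against the same OpenAI-compatible endpoint Goose calls, to check whether the model ever returns tool_calls; the tool definition below is purely an example:

    import json
    import requests

    # Debugging sketch (not part of Goose): hit the same OpenAI-compatible
    # endpoint Goose uses and see whether the model returns a tool call at all.
    payload = {
        "model": "qwq",  # or "gemma3", as in the post above
        "messages": [{"role": "user", "content": "What is the weather in Sydney?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }
    resp = requests.post("http://localhost:11434/v1/chat/completions", json=payload)
    print(resp.status_code)
    print(json.dumps(resp.json(), indent=2))
    # If there is no "tool_calls" entry in choices[0]["message"], the model (or its
    # Ollama template) isn't producing tool calls, which matches the Goose error.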