theory extension and construction is what knowledge work will become.
we're seeing a shift towards specification construction with AI bots. This
is something they are *not* good at. They're great at boilerplate
english and work fantastically well when given a proper theory to work
from. but novelty? not so much!
kayvr
npub17tm2...0r0j
...
targeted, task-specific small models are fantastic to tune: clear signal
on what works and what doesn't.
adapted large models give you stability but make task-specific work
'grayer'. still, they can yield fantastic results when adapted properly.
Claude wrote some code to unfreeze a 'teacher' NN feeding into an
LLM. It added an auxiliary loss instead of relying on the gradient
signal directly from the LLM. I thought "that's interesting" and kept it.
Now 3 days of training and compute got flushed. All because I vibed.
There is something about being principled when working with code that
is consequential. Maybe all code.
Then again. I learned something didn't work. That's a plus!
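For the record, the pattern was roughly this: instead of letting the LLM's gradient alone train the unfrozen teacher, an extra auxiliary loss gets applied to the teacher's output directly. A toy scalar sketch of that shape, with made-up losses and targets (none of this is the actual code Claude wrote):

```python
# Hypothetical sketch: the teacher is unfrozen and trained with an
# auxiliary loss *added* to the main (LLM-side) loss, rather than
# relying only on gradients flowing back from the LLM.
# Toy 1-D "teacher" weight, finite-difference gradients; all names
# and both loss shapes are illustrative assumptions.

def llm_loss(teacher_out):
    # main objective as seen from the teacher's output (toy quadratic)
    return (teacher_out - 2.0) ** 2

def aux_loss(teacher_out):
    # auxiliary target applied directly to the teacher (toy quadratic)
    return (teacher_out - 1.0) ** 2

def total_loss(w, x, lam=0.5):
    out = w * x  # toy teacher: a scalar linear map
    return llm_loss(out) + lam * aux_loss(out)

def grad(f, w, eps=1e-6):
    # central finite difference in place of autograd
    return (f(w + eps) - f(w - eps)) / (2 * eps)

w, x, lr = 0.0, 1.0, 0.1
for _ in range(200):
    w -= lr * grad(lambda v: total_loss(v, x), w)

print(round(w * x, 3))  # → 1.667, pulled between the aux target 1.0 and the main target 2.0
```

The failure mode in the story is visible even here: the auxiliary term drags the teacher away from the main objective's optimum, and whether that helps or hurts is invisible until you've burned the compute.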
RLVR is the technique that made Opus 'smart'. But there have been a number
of papers saying that these techniques don't teach the model anything.
Why would that be the case? It wasn't until recently that it clicked for
me. Reinforcement learning lets the model generate its own outputs to
*train* on. Its own outputs are the training targets!
The reward signal is applied to its own output. Just like... rote learning.
So, I mean, how could it learn anything new? It's only reward signals.
Even in RLHF.
It appears that these techniques surface abilities the model
already has. They just increase the speed at which it finds a
more correct solution. It is a *search* optimization. A better
*sampling* technique.
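One way to see the "search optimization" point is a toy REINFORCE loop where the reward only ever touches the model's own samples: an answer the model assigns zero probability is never sampled, never rewarded, and never learned, no matter how good it is. Everything here (the answer set, the reward, the learning rate) is made up for illustration, not any specific paper's setup:

```python
# Toy REINFORCE over a fixed answer set. The "model" trains on rewards
# attached to its *own* samples, so probability mass can only shift
# toward answers already in its support.
import math
import random

random.seed(0)
answers = ["wrong_a", "right", "wrong_b", "unreachable_right"]
logits = [0.0, 0.0, 0.0, -math.inf]  # model cannot emit the last answer

def softmax(ls):
    m = max(l for l in ls if l != -math.inf)
    exps = [0.0 if l == -math.inf else math.exp(l - m) for l in ls]
    z = sum(exps)
    return [e / z for e in exps]

def reward(ans):
    # verifiable reward: 1 for any correct answer
    return 1.0 if "right" in ans else 0.0

lr = 0.5
for _ in range(500):
    probs = softmax(logits)
    i = random.choices(range(len(answers)), weights=probs)[0]  # sample own output
    r = reward(answers[i])
    # REINFORCE: grad of log p(i) is one_hot(i) - probs, scaled by reward
    for j in range(len(logits)):
        if logits[j] == -math.inf:
            continue
        g = (1.0 if j == i else 0.0) - probs[j]
        logits[j] += lr * r * g

probs = softmax(logits)
print(probs[1], probs[3])  # "right" comes to dominate; "unreachable_right" stays at 0
```

The reward sharpens the sampling distribution around the reachable correct answer, but no amount of reward signal ever puts mass on the answer outside the model's support. That's the sense in which it's a better sampler, not new knowledge.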
Here's the paper that started some of this discussion:
https://arxiv.org/pdf/2506.14245
llms will likely follow the same technology adoption path as bitcoin:
asics will take over inference from gpus. gpus will then be relegated
to training and experimentation only.
there was a recent demonstration of this. i'll take my claude code asic
please. puts anyone?
TMUX is the new tiling window manager. No need for DEs or compositors anymore!
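In that spirit, a minimal `~/.tmux.conf` sketch for tiling-WM-style keyboard control (the Alt-key bindings are my own picks, not any standard):

```
# vim-style pane navigation without the prefix key
bind -n M-h select-pane -L
bind -n M-j select-pane -D
bind -n M-k select-pane -U
bind -n M-l select-pane -R

# open splits like a tiling WM, keeping the current directory
bind -n M-Enter split-window -h -c "#{pane_current_path}"
bind -n M-v split-window -v -c "#{pane_current_path}"

# rebalance all panes into a grid
bind -n M-t select-layout tiled
```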
Testing out kind 1 nostr notes in this new client. Next up is long-form
30023 articles.
i'm testing out a new simple nostr client. built on nak
Testing note creation in 'nak fs'.