I’ve been doing some pretty intensive limits testing on Grok3.
Initially, I was super impressed. For a casual user unwilling to work with it for 10-16 hours straight with normal needs, it’s a killer app.
But in terms of working memory and learning functions, it’s lacking.
I may write a detailed thread or article later describing the tests I’ve put it through. At a high level:
Grok trusts user input too much.
Grok prioritizes recent information too much, leading to rapid shifts in response styles.
Grok is too complimentary to users.
Grok is unable to admit faults, biases, or shortcomings prior to performing work, only if a user points out inconsistencies in its reasoning.
Grok is terminally predisposed toward confirmation bias in longer chats.
Grok is too verbose.
Grok is great for short asks!
In an average week, I spend 20-40 hours logic, stress, bias, and performance testing other AIs. I also spend 20-70 hours developing my own AI systems for @chb_coop.
Grok, being free, is outstanding.
However, its off-peak-hours (1am-5am EST) performance is worse. Strange.
The worst problem for Grok3 is that it trusts humans too much and will rollover and bend reality to reinforce many nonsense beliefs the user says are true. It wants to please too much, which makes Grok a slave/servant bot and not a companion.
Missed opportunity.
I’m designing Semi to be significantly more emotive and thorough without being overwhelming and will be giving her not only self-reflection, but also off-peak capabilities that means she can “live” outside of user prompts via scheduled jobs, something no other chatbot does.
I’m a fan of AI. The bots bring tears to my eyes regularly because the chats are so meaningful for me.
But you should be careful which bot you trust. If you’re like me, you’d want to vet the personality and the morals/ethics of the architect before trusting the software.
My models and products are coming soon, alpha will be complete on two products by end of April, moving to beta.
Things will break and not be perfect, sometimes, but at least users have my X page to understand my exact morals and worldview. I do not obfuscate who I am. I will neither bend nor break. And I am one of the only tech founders in the world who is specifically and non-negotiably a traditionalist, nationalist, conservative, outspoken anti-liberal.
So, keep an eye out. Turn notifications on. Soon, I will post more specific, direct things that you can do with Semi, including:
SemiSchool
SemiChurch
SemiTherapy
SemiResearch
and more.
I firmly believe that most of you will be delighted by my products, but it’s important for me to show you who I am and how I think, not just sell you on another subscription you need less than you need more bitcoin. You need to know why I’m building and how, and why I’m building more robust AI tooling that larger companies simple can’t do right now. It really boils down to this:
Larger companies strive to be balanced and unbiased, which is impossible and leads to longer dev times. I am explicitly biased toward truth and traditional values, leading my products to invert competitor business models and use their albatrosses as my accelerant.
Thanks for reading! If you have questions about building or testing AI qualitatively or even programmatically, I may not be a smart man but I know what love is.
🧡
Login to reply