Joe Resident's avatar
Joe Resident
npub15sas...8xgu
Working on a gardening robot called Wilbur; we need to give the power of AI to individuals or the next 30 years could be really ugly
Joe Resident 9 months ago
Trump saying the Declaration of Independence was a declaration of "Unity, Love, and Respect" 🤣 Speaking of the document that basically says "That's IT, we've been oppressed too long, fuck you Britain, it's OVER, we're our own thing now, you're out" 🤣 "Unity, Love, and Respect" indeed, I'm dying, he says the funniest things sometimes
Joe Resident 9 months ago
Thinking about using my real name for social media, including this account. Because I'm starting a public project, with my real face, and videos, and hardware, and I don't want the mental overhead of hiding behind a pseudonym and vetting every post for identifying info. (May be related to my trace amounts of autism: lying or anything like it is extremely taxing. Not that I think pseudonyms are wrong, just that they require juggling multiple identities, similar to managing multiple realities when one has decided to lie about something.) But I'm also a privacy advocate and use a de-googled phone, encrypted email, VPN, etc. So it's against my knee-jerk tendency to maximize privacy. Good idea or not? #asknostr
Joe Resident 9 months ago
Interesting paper I hadn't seen, the 'Densing Law' of LLMs: models are getting twice as capable at the same size every 3.3 months. Qwen 3, released today, may be an emphatic continuation of the trend. Need to play with the models more to verify, but the benchmark numbers are... Staggering. Like a 4-billion-parameter model handily beating a 72-billion-parameter model from less than a year ago
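The doubling claim compounds fast; here's a quick sketch of the arithmetic. Only the 3.3-month doubling period is from the paper, the rest is just exponent math:

```python
# Densing-law arithmetic: capability density doubles every ~3.3 months,
# so the multiplier after t months is 2^(t / 3.3).
DOUBLING_MONTHS = 3.3  # figure reported in the 'Densing Law' paper

def density_multiplier(months: float) -> float:
    """Capability-density gain after `months`, assuming steady exponential doubling."""
    return 2 ** (months / DOUBLING_MONTHS)

# Over one year that's roughly a 12x density gain, i.e. a model ~12x
# smaller matching last year's quality, if the trend holds.
print(round(density_multiplier(12), 1))
```

That's in the right ballpark for a 4B model catching a 72B one from under a year back, if the trend really is this steep.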
Joe Resident 9 months ago
o3 isn't as good as I hoped, but it's still an increment in the SOTA. 69% on SWE-Bench Verified! The regression line over the past 2 years still points to 100% by year end! Frankly I think the real story is how cheaply Gemini 2.5 is delivering 64% on SWE-Bench. Exciting times! Coding with Gemini 2.5 is so satisfying, a big step up from DeepSeek V3.1, which is what I was using before. #ai #llm #o3
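The "regression line points to 100%" extrapolation is easy to sketch. Only the o3 figure (69%) is from this post; the earlier data points below are hypothetical placeholders just to show the mechanics:

```python
# Least-squares fit of SWE-Bench Verified scores vs. time, then solve
# for where the line crosses 100%. Earlier scores are made-up placeholders.
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

months = [0, 6, 12, 18, 24]    # months since an arbitrary start date
scores = [10, 25, 40, 55, 69]  # % resolved; only the final 69% (o3) is real

slope, intercept = fit_line(months, scores)
months_to_100 = (100 - intercept) / slope
print(f"fit crosses 100% at month ~{months_to_100:.0f}")
```

Of course a linear fit has to break down near the top of the benchmark, so "100% by year end" is more a vibe than a forecast.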