Fabio Manganiello
blacklight_at_social.platypush.tech@mostr.pub
npub1v78m...kv0u
:platypush: Tinkerer and main #developer @ #Platypush :mastodon: #MastoAdmin @ social.platypush.tech :booking: Senior #software engineer @ Booking.com ⚙ #Automation addict 🤖 #AI builder :linux: #Linux user since 2001 🔓 #FOSS contributor :arch: Prone to unsolicited "btw I use #Arch" statements 🏡 #SelfHost all #tech! 🔬 Open #science and open #data advocate 🎶 #Music geek 🎸 #Guitarist + occasional composer 🛹 #Skater 🏄 #Surfer 👪 #Dad of a small geek 🇮🇹 ⇒ 🇳🇱
London 2016: "We want to exit the European Union without exiting the European common market" London 2023: "We want to break encryption without really breaking encryption" The British ruling class is either incompetent or dishonest. There's really no other explanation behind such a high degree of irrationality and State-sponsored imbecility.
I'm actually not entirely against AIs #scraping the web. Once the genie is out of the bottle, you can't put it back in. If there's some content out there that is freely accessible, and it can be used to make large models better, it will certainly be used - we shouldn't be too naive or ideological about that. I've always supported total freedom of scraping for everyone. I've always supported a world where all the content on the Internet can also be parsed by machines (that was the entire idea behind the semantic web). Once public content is out there, we lose control over who accesses it and for what purposes - that's simply how the web works. But if Google and Meta are suddenly in this "we ♥ scraping" mood, I'd expect them to stick to their words and at least allow bidirectional scraping. As an AI geek, I'd love to train my models on large corpora of audio extracted from YouTube videos. Or on what people post in public Facebook groups when particular events happen. Or on how the price of a product fluctuates on Amazon as the result of several external factors. But I can't legally do any of these things. Those platforms are sealed, their APIs are very limited by design, only a limited number of researchers can access some of that data (after signing lengthy NDAs and agreeing that the parent company will decide whether the research can be published), and they have tons of frontend-only checks to ensure that only a human downloads that content - and that they watch a sufficient number of ads in the process. Not only that - the developers of scraping software like youtube-dl also get regularly harassed by Google. So why should I tolerate a world where, if you're big enough, you can afford to scrape the shit out of everyone and use that knowledge to become even bigger and more powerful, but nobody is allowed to do the same with your own content? We urgently need regulation that creates a level playing field when it comes to automated access to online information.
Freedom of scraping means freedom of growing. We can't give this freedom only to those who are big enough. We need to make web scraping a fundamental human right. And large companies should be compelled to share their data without barriers to scrapers too, if they aren't willing to build proper APIs. Until that happens, I'll keep scraping the shit out of those bastards without feeling an inch of guilt.
When explaining why AI is a high-friction industry with a few gatekeepers, we often focus on the high-level side of things (only a few companies have both enough data and enough computing power to train large models etc.). And we often forget the absolute monopoly that rules anything close to the silicon. >95% of the ML models today are trained on Nvidia GPUs. And Nvidia has an absolute monopoly over both the hardware (the GPU itself) and the middleware (CUDA) in the form of proprietary tech without a single competitor.
Can you imagine buying a fridge, an oven or a washing machine, just for it to suddenly stop working after 2-3 years because the producer decides to turn off their servers? Even if the device itself is perfectly functional from a hardware point of view, its inability to call home suddenly makes it as useful as a bulky rock sitting in your house. This is exactly the situation where the "#SmartHome" sits today. I bump into news like this at least once a week. "#IoT maker X decided to stop investing in product Y. Users of Y, who probably invested a few hundred/thousand bucks in the product very recently, will suddenly discover that their smart device is now dumb. No apology/refund is required/expected from X's side" 1. We need regulation that guarantees proper support for smart devices. If I buy a fridge or any home appliance, I'd expect it to work for at least 10 years. The same should apply to home appliances that just so happen to run some spyware inside. Going after e-waste also means going after those who suddenly turn thousands of devices into e-waste by switching off a single server. 2. The smart home still has plenty of advantages if you build it yourself. Don't rely on people who just want to iterate fast and break things when it comes to the stuff that runs the things in your house.
I have (re)installed my local #Pixelfed instance, and I definitely love the progress made in terms of UX and features. But there are still two quite important features that have been deployed on the main instance but apparently not merged upstream: 1. Login with Mastodon: I'd like to use my own main account for Pixelfed, or at least have it explicitly linked, rather than having yet another account on another service. 2. Instagram JSON import: that would really save me a lot of time. The feature is present on the main Pixelfed instance, but unless I'm missing something obvious I couldn't find a way to enable it on mine. @npub12tyk...vnqc any hopes/plans to see these features merged upstream?
(Not so) dear #Google, I appreciate your honesty in admitting that ads are the (only) way you make money. And I also appreciate that you finally decided to give users control over which ads they want to see. But to me this is too little, too late. Sending emails to Gmail accounts that I haven't used in years to tell me "hey, now you can control what ads you see" won't change my mind about you and your business practices. Especially when you folks keep pushing one abhorrent idea after the other (from Manifest v3, to FLoC, to the Web Integrity API) with the sole purpose of killing ad blockers, screen readers, tracker polluters, alternative browsers and operating systems, and any potential competitors - even if it means making the Web worse for everyone. (Not so) dear Google, your carrot-and-stick strategy just shows how desperate you are. And you wouldn't be so desperate if you were making good profits. Which means that our strategy (using non-Chromium-based browsers, avoiding all of your services, using ad blockers, blocking all 3rd-party cookies, using PiHole to redirect all of your DNS requests to /dev/null, polluting the data points sent to your trackers etc.) is working, and it's effectively attacking your bottom line. (Not so) dear Google, passive-aggressive emails like these are just an invitation for me and others to keep doing what we're doing. You folks exist only because you've mastered the art of spying on people to show better ads to them. If we cut both the flow of data in input and the flow of ads in output, you die.