Thread - Nostr Hypermedia

Ars Technica - All News (RSS/Atom feed) 2 months ago

Microsoft removes guide on how to train LLMs on pirated Harry Potter books

Following backlash in a [Hacker News thread][1], Microsoft deleted a blog post that critics said encouraged developers to pirate Harry Potter books to train AI models that could then be used to create AI slop.

The blog, which is archived [here][2], was written in November 2024 by a senior product manager, Pooja Kamath. According to her LinkedIn, Kamath has been at Microsoft for more than a decade and remains with the company. In 2024, Microsoft tapped her to promote a new feature that the blog said made it easier to "add generative AI features to your own applications with just a few lines of code using Azure SQL DB, LangChain, and LLMs."

What better way to show "engaging and relatable examples" of Microsoft's new feature that would "resonate with a wide audience" than to "use a well-known dataset" like Harry Potter books, the blog said.

[Read full article][3] [Comments][4]

[1]:

Microsoft guide to pirating Harry Potter for LLM training (2024) [removed] | Hacker News

[2]: https://archive.is/D9vEN

[3]:

Ars Technica

Microsoft deletes blog telling users to train AI on pirated Harry Potter books

The now-deleted Harry Potter dataset was "mistakenly" marked public domain.

[4]:

Ars Technica

Microsoft deletes blog telling users to train AI on pirated Harry Potter books

The now-deleted Harry Potter dataset was "mistakenly" marked public domain.

Microsoft generated an AI image of Harry Potter with a Microsoft logo in a now-deleted blog.
