This is a very good idea:
Not because of the way @arthurfranca frames it, but because it fixes the problem of web clients: it makes them what the user wants them to be, something that truly represents the user, not some new malware loaded every day from a URL controlled by someone else.
I wrote about this problem here before.
Ideally, in 44billion or a 44billion-like app (which should use Blossom URLs to refer to apps instead of Nostr events -- or the Nostr events should reference Blossom URLs), each app is a hash. In theory you can give it any name, and once you have that hash and the associated files downloaded, you're no longer prey to weird changes made by the author, to the app going down, or to the app starting to hijack and damage Nostr or decentralization: you can keep using the same app.
Of course you can always opt in to get the latest releases, but, more importantly, you should be able to set someone else as the "official" maintainer of a client, and your friends can notice that and do the same, so the original maintainer can be semi-automatically ignored if they decide to go rogue, and their fancy domain name won't help them keep tricking users anymore.
It's a long shot, but a necessary one.
(Of course none of this would be necessary if we all used native clients instead of "web".)
Someday nostr will replace "web"
I did not find source code for 44billion.net. Is there a spec for this "napp" packaging?
The parent web app could run relay connections & event storage, and child apps could communicate with it over postMessage.
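Roughly, a minimal sketch of that split (the message names and shapes here are invented, not any existing napp protocol):

```typescript
// Sketch: the parent window owns the relay pool; sandboxed child napps
// (iframes) ask it to publish/subscribe via postMessage.
// "publish"/"subscribe" and the request shape are hypothetical.

type NappRequest =
  | { id: string; method: "publish"; event: object }
  | { id: string; method: "subscribe"; filter: object };

// Parent side: listen for child requests and answer them.
window.addEventListener("message", (msg: MessageEvent<NappRequest>) => {
  const frame = msg.source as Window;
  if (msg.data.method === "publish") {
    // forward to the parent's relay pool (not shown), then ack
    frame.postMessage({ id: msg.data.id, ok: true }, msg.origin);
  } else if (msg.data.method === "subscribe") {
    // the relay pool would stream matching events back as they arrive
    frame.postMessage({ id: msg.data.id, type: "eose" }, msg.origin);
  }
});

// Child side (inside the napp iframe): fire a request at the parent.
window.parent.postMessage(
  { id: "1", method: "subscribe", filter: { kinds: [1], limit: 10 } },
  "*" // a real napp would pin the parent origin instead of "*"
);
```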
Things like @Umbrel ☂️ could be such a store as well. Of course it's not fully embedded online. But Nostr has the potential to make self-hosting worth it, and fun. You could have your app depend on a remote signer hosted on the same server, a media proxy, and all the good stuff that makes a Nostr app performant.
I have to write it down. I'll release a CLI soon that handles packaging and upload; it is just missing some details already present when uploading through 44billion.net's napp store.
It uses base93 for file chunks, the most space-efficient JSON-safe binary-to-text encoding. The chunks also carry Merkle Mountain Range (MMR) proofs.
I guess @Vitor Pamplona will love it because it's old NIP-95 on steroids.
You should just make each Napp be an event with a list of filenames and hashes; then whoever wants to fetch it does so from the publisher's Blossom server. Once it's downloaded it's served from the local cache. There is no need for, and nothing to gain from, this base93 madness.
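For instance, hypothetically (the kind number and tag names below are made up, but the fetch-by-hash part is just how Blossom's BUD-01 works):

```typescript
// Sketch of the manifest-event idea: one Nostr event lists the napp's files
// by name and SHA-256 hash; each file is fetched from the publisher's
// Blossom server as GET /<hash> and verified locally before caching.
// Kind 31337 and the "file"/"server" tags are invented for illustration.

const manifest = {
  kind: 31337,
  tags: [
    ["d", "my-napp"],
    ["file", "index.html", "<sha256 of index.html>"],
    ["file", "app.js", "<sha256 of app.js>"],
    ["server", "https://blossom.example.com"], // hypothetical publisher server
  ],
  content: "",
};

async function fetchFile(server: string, name: string, hash: string): Promise<Blob> {
  const res = await fetch(`${server}/${hash}`); // BUD-01 style: blobs addressed by hash
  const buf = await res.arrayBuffer();
  // verify the content actually matches the hash in the manifest
  const digest = await crypto.subtle.digest("SHA-256", buf);
  const hex = [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
  if (hex !== hash) throw new Error(`hash mismatch for ${name}`);
  return new Blob([buf]); // cache it; subsequent loads never hit the network
}
```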
lol, yes. We have base64-encoded encrypted data everywhere in Nostr events. It's time we stop this bullshit against binary payloads. We have been doing binary on relays quite well.
it's fine for regular-sized events, but it's going to take a big whack out of the database's performance at finding small events when it has giant blobs randomly scattered through the KV log structures.
and it's trivial to add a blossom server to a relay; my relay already has one. in theory it supports all of the BUDs too. ah yes:
here's a reasonable starting point for implementing one in Go:
https://github.com/mleku/next.orly.dev/tree/main/pkg/blossom
Some people are incapable of thinking about secondary or tertiary consequences of their actions.
Nostr already has a bunch of very large blobs. People need to stop using KV bullshit and start building real databases because it is only going to get worse from here. Events are starting to become massive with all the tags we are adding already.
my personal definition of "normie" is precisely that.
usually it requires them to strongly believe in something that is bullshit, which contradicts and interferes with them even asking the question.
curiosity is the first thing that the system has to bury to make a person normal.
A bundle event is already part of the packaging. It references files by their MMR root hash.
By using MMR, relays can be sure the publisher isn't lying about the number of chunks and that a chunk sits at exactly position x. The chunk size is fixed, so relays can choose to block a chunk if it's part of a huge file.
Imho Blossom isn't needed. Although relays specialized in storing big events could arise, the napp files tend to be so small that we just write them to the publisher's outbox relays.
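To illustrate the position check, here is a simplified sketch: a plain binary Merkle tree with a power-of-two chunk count, rather than a full MMR with its peak-bagging step, and every name in it is hypothetical:

```typescript
// A relay holding only the root committed to in the bundle event can verify
// that `chunk` really is the leaf at `index`, without seeing the other chunks.
import { createHash } from "node:crypto";

const sha256 = (data: Buffer): Buffer => createHash("sha256").update(data).digest();

function verifyChunk(
  root: Buffer,    // root hash committed to in the bundle event
  chunk: Buffer,   // the fixed-size chunk being uploaded
  index: number,   // claimed position of the chunk
  proof: Buffer[], // sibling hashes along the path from leaf to root
): boolean {
  let hash = sha256(chunk);
  let pos = index;
  for (const sibling of proof) {
    // the low bit of `pos` says which side the sibling is on; this is what
    // binds the chunk to its claimed position, so the publisher can't lie
    hash = pos % 2 === 0
      ? sha256(Buffer.concat([hash, sibling]))
      : sha256(Buffer.concat([sibling, hash]));
    pos = Math.floor(pos / 2);
  }
  // the proof length fixes the tree height, which in turn fixes the total
  // chunk count (2^height), so the publisher can't lie about that either
  return pos === 0 && hash.equals(root);
}
```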
lol
i can tell that you have a strong grasp on the implementation requirements for a relational database or graph database. they ALL use KV stores under the hood.
the only way real-world database engines tolerate this kind of structure without degrading iteration performance is by partitioning the large data from the small. or, put another way: why would you store a jpeg in an event instead of as raw binary in a blossom record? it's trivial to just store them as files with the hash as filename, and the filesystem is already optimized for traversing the metadata to find them efficiently.
Nothing is "needed"; I'm just stating what is better.
Blossom is better because it's more efficient; it will work every time and not hit any barriers; it is much simpler to implement from both the publisher side and the reader side; and it's nicer to use other people's infrastructure the way it was designed to be used.
If we want this Napp model to become a standard (and we need that if we are to solve the problems I'm pointing at), then these are all very valuable properties.
Yes, these are good ideas. These features could be window.nostr.napp optional add-ons.
Currently, each app handles its own WebSocket connections and local storage as usual, but the good thing is that it can assume there is only ever one user, always the same one. No need to code for multi-user: while 44billion.net itself is multi-user, the napp sees just one.
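Hypothetically, such an add-on could look something like this (nothing here is a published NIP; every method name and type is an assumption sketching the single-user model):

```typescript
// Hypothetical shape for the window.nostr.napp add-on discussed above:
// the host is multi-user, but each napp sees exactly one identity.

interface NappHost {
  getPublicKey(): Promise<string>;           // always the same user for this napp
  signEvent(event: object): Promise<object>; // delegated to the host's signer
  publish(event: object): Promise<void>;     // the host owns the relay connections
  storage: {
    // per-napp, per-user local storage, scoped by the host
    get(key: string): Promise<string | null>;
    set(key: string, value: string): Promise<void>;
  };
}

// A napp never branches on "which user": there is only ever one.
async function saveDraft(text: string) {
  const napp = (window as any).nostr?.napp as NappHost | undefined;
  if (!napp) throw new Error("not running inside a napp host");
  await napp.storage.set("draft", text); // host scopes the key to (napp, user)
}
```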
ok, i'm wrong about that. there are such things as column stores, which use a different structural model than key/value.
after a few questions drilling down to the point, this is what GPT gives me:
Short answer to your core question — "Is a flat file of blobs addressable by SHA256 hash the best solution?"

No, not as a single monolithic flat file. But: content-addressable blobs by hash = excellent idea.

The best implementation in practice is usually: append-only segment files (or many files), with a hash → location index, and possibly hash-sharded directories or an embedded KV store.

This will:
- be faster and more scalable than a single flat file
- be more efficient than a naive "one file per blob" once you hit large counts
- play very nicely with your immutable tables
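a minimal sketch of that segment-file recommendation, assuming node's fs API (the SegmentStore class and all its details are invented, not orly's actual code):

```typescript
// Blobs are appended to one segment file and looked up through an in-memory
// hash -> (offset, length) index. Content addressing makes dedup free.
import {
  appendFileSync, openSync, readSync, closeSync,
  statSync, existsSync, writeFileSync,
} from "node:fs";
import { createHash } from "node:crypto";

class SegmentStore {
  private index = new Map<string, { offset: number; length: number }>();

  constructor(private path: string) {
    if (!existsSync(path)) writeFileSync(path, Buffer.alloc(0));
  }

  put(blob: Buffer): string {
    const hash = createHash("sha256").update(blob).digest("hex");
    if (!this.index.has(hash)) {               // already stored: nothing to do
      const offset = statSync(this.path).size; // append-only: write at the tail
      appendFileSync(this.path, blob);
      this.index.set(hash, { offset, length: blob.length });
    }
    return hash;
  }

  get(hash: string): Buffer | null {
    const loc = this.index.get(hash);
    if (!loc) return null;
    const buf = Buffer.alloc(loc.length);
    const fd = openSync(this.path, "r");
    readSync(fd, buf, 0, loc.length, loc.offset); // single positioned read
    closeSync(fd);
    return buf;
  }
}
```

a real store would roll to a new segment at a size cap and persist the index, which is where the embedded KV store it mentions comes in.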
storing events with an embedded KV store like lmdb or badger and storing binary blobs on your regular ext4 filesystem (maybe f2fs is better for SSDs) is actually the most efficient way to store and fetch large blobs. not only that, as it points out, you need to partition the blob storage to have at least a small-blob index and a large-blob index, because the more you mix up small and big things in one storage system, the slower you can search it, especially for the small objects.
or in other words
just put a blossom server on it. regardless of anything else, base64 inflates payloads by about a third, and that overhead can't be amortized away with compression. encoded forms also require decoding before you can actually pick through the content. simple blob files named after their hash and indexed in a table are the simplest, yet nearly the most efficient, way for nostr to handle hosting blobs.
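something like this, hypothetically (the root path and the two-character sharding width are arbitrary choices, not any relay's actual layout):

```typescript
// One file per blob, named after its hash, with two-character hash sharding
// so no single directory grows unboundedly. The filesystem is the index.
import { mkdirSync, writeFileSync, readFileSync } from "node:fs";
import { createHash } from "node:crypto";
import { join } from "node:path";

const ROOT = "/var/lib/relay/blobs"; // hypothetical storage root

// e.g. hash "ab12..." lands at /var/lib/relay/blobs/ab/ab12...
function blobPath(hash: string): string {
  return join(ROOT, hash.slice(0, 2), hash);
}

function storeBlob(blob: Buffer): string {
  const hash = createHash("sha256").update(blob).digest("hex");
  mkdirSync(join(ROOT, hash.slice(0, 2)), { recursive: true });
  writeFileSync(blobPath(hash), blob);
  return hash;
}

function loadBlob(hash: string): Buffer {
  return readFileSync(blobPath(hash));
}
```

the fan-out keeps every directory small, and a lookup is one path construction plus one open: the filesystem's own metadata trees do the indexing work.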
So... ALL dbs use KV except for the real-world databases that are doing it right? I know most DBs are just a glorified KV. But I also know the ones that actually fixed this a while back instead of just deferring to the developer to manually save shit in files.
computer filesystems have been refined for structuring the data to optimize seek latency for over 50 years. there really is no place you can look in the field for a better option for finding and retrieving large amounts of data quickly.
back in the olden days, i sometimes ran my linux installation on ReiserFS, before he got put in jail for killing someone (lol, i still can't comprehend that). reiserfs was one of the most notable filesystems in the field of data storage, pioneering optimizations for the big-file/small-file problem. ext4 has a substantial amount of them, and iirc reiser was the first to have a two-stage commit journal that constantly wrote an append-only log, with a worker in the background that compacted it into the filesystem when idle, populating the necessary metadata tables.
while there may be other variants of data structure storage for fast searching indexes than LSM K/V stores, as far as i know, for general usage it's still the best strategy and is very friendly to kernel disk cache memory.
the more the iterator has to decide, as it progresses, about where it's reading from next, the less time it spends actually reading the thing you want. that's why it matters, and that's why there's no sane reason to reinvent the wheel for large blob storage: very few production systems have implemented it any other way.
here's a summary of all the innovations that ReiserFS made that are now almost universally used in operating system filesystems:

[ChatGPT share: Database table storage strategies]
Correct. Very few devs have tried to go beyond dumb approaches. The question is: do you want to be part of the very few who went above and beyond to build faster systems, or do you just conform to the average dev's incapacity to actually do computer science and call it a day?
well, that's the neat part: at least there is one nostr relay dev who has some appreciation for the subject.
(fiatjaf is the other, his stuff is pretty damn sleek)
just had to probe gpt a bit more about reiser also, and it went into a lot of depth about the stuff in reiser4 that didn't end up being practical, as well as a last section about what things it did that a typical modern CoW filesystem would benefit from:
immutable, append-only data like nostr events and blossom blobs is extremely relevant to this, with regard to the question of efficiently storing especially large binary data alongside small, regular event json data. i'm snipping this to put somewhere for later, to maybe find some ways to speed up orly's database even more. i think right now it's pretty good, and would scale to 3-figure-core server rigs with RAID nvme drives pretty well, but these optimizations would become more and more important the larger the database gets.
so, yeah
circling back to the OP: @Vitor Pamplona, in fact, your recommendation is probably generally correct for most tasks that nostr relay operators (newbies to sysops) would want to do, without such improvements being applied.
[ChatGPT share: Database table storage strategies]
A Tauri version of this would be interesting: independent of DNS + CA + host server, it doesn't update without the user knowing, and it could use a native keystore outside the browser renderer.
I won't argue with the protocol creator: you win hahah
I will just add that the choice of using relays was also based on a longer-term goal of offering a local backup of all the user's data, which will be easier to implement if everything is expected to be a Nostr event.
Yeah agree. It would have access to the same napps available on the web version, very cool.
you really do like turning conversations into conflicts, don't you? 🤔
We grow on conflict.
it's true.
but some people hide their lack of healthy social skills behind that reality.
True. But none of us have any social skills. We just pretend for as long as we can. :)