Rusty Russell
rusty@rusty.ozlabs.org
npub179e9...lz4s
Lead Core Lightning, Standards Wrangler, Bitcoin Script Restoration ponderer, coder. Full time employed on Free and Open Source Software since 1998. Joyous hacking with others for over 25 years.
One day I will reach the pinnacle of nostr technical capabilities, and be able to access my npub.cash funds. Right, @calle?
I am who I am. My voice may well get drowned out by those who are louder, better resourced or more persuasive. But as an engineer working in this space, I need to tell the raw unfiltered truth as best I see it, and I hope others feel the same?
He explicitly argued that a miner who spends $1B on mining infra should not have their investment undermined by protocol changes. It was pretty concrete! They would definitely argue that a protocol change which increased space efficiency would have a short-term effect on fee revenue, right? It seems a pretty clear-cut case where Saylor's model is directly against most people's understanding of what we should improve.
I listened to the What Bitcoin Did Saylor podcast, and I really want to respond, though that may be unwise. But I want thoughtful, fearless content in my feed, so I should start making some, right?

Firstly, while analogies can provide useful guide rails for understanding, listening to people *arguing* using analogies makes you stupider. Debate the thing itself, not the words about the thing: it hurts my head to even think about doing this, so I won't.

Let's set my priors first: I assume we're talking about technically solid, well-vetted, backward-compatible protocol changes: this is the minimum bar. I don't wholesale agree with Saylor's "don't threaten anyone's investment" hard limit. This has happened multiple times in the past: the dust limit breaking SatoshiDice, the arrival of Lightning threatening miner fees (real or not), and segwit breaking covert ASICBoost. These interests can, and will, stand up for themselves, and will compete against the other benefits of changes. To be explicit: I consider any protocol change which makes block space usage more efficient to be a win!

Obviously Saylor is invested in Bitcoin the asset, and can afford to do all his business onchain in any conceivable scenario. His projection of a Bitcoin world in which there are 100,000 companies and governments who use Bitcoin as the base layer is interesting:

1. This does not need "smart contracts", just signatures. By this model, Bitcoin Script was a mistake.
2. It can work even if Bitcoin does not scale and is incredibly expensive to spend and hold. By this model, the consumer hardware wallet industry is a dead end and needs to pivot to something else (nostr keys, ecash?).
3. You could do this with gold, today: Bitcoin here is simply an incremental, not fundamental, improvement. I think this is suggestive, though: such a network would not be long-term stable, and would be very much subject to capture.
4. In this view, Saylor is simply a gold bug with first-mover advantage, shilling his bags. That's fine, but it's important to understand people's motivations.
5. This vision does not excite me. I wouldn't have left Linux development to work on making B2B commerce more efficient. I wouldn't get up at 5:30am for spec calls, and I sure as hell wouldn't be working this cheap.

I believe we can make people's UTXOs more powerful, and thus feel a moral responsibility to do so. This gives them more control over their own money, and allows more people to share that control. I assume that more people will do good things than stupid things, because assuming the other way implies that someone should be able to stop them, and that's usually worse. I believe the result will be a more stable, thus useful, Bitcoin network. I am aware that this will certainly benefit people with very different motivations than me (Saylor).

Thanks for reading, and sorry for the length!
#dev #scriptrestoration Took some time off this week, and it was slow anyway. I have ordered a proper power supply for my RPi3, as compiling Bitcoin is crashing it (which bodes badly for actually running the benchmarks!). After some head-scratching results I have returned to micro-benchmarks. My initial model treated every read and write as having the same cost, but I'm investigating alternatives. In particular, the ubiquity of SIMD instructions and write combining in modern processors means some operations are *much* faster than a naive "pull 64 bits into a register, write it out". I wouldn't care about beating the model generally, but: 1. I have to make sure my benchmarks are measuring the right thing, and 2. OP_DUP, OP_EQUAL and the like are so common that if they really are twice as fast as the naive versions on every platform, it's worth considering whether the model should accommodate them explicitly. So I'm now breaking them down into micro-benchmarks for each case explicitly. There's so much data here, particularly when considering different size operand limits, that I'm going to have to produce a series of graphs to illustrate (and hopefully illuminate!) the results.
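To make the flat model concrete, here's a toy sketch in Python. The per-opcode read/write counts and the cost numbers are illustrative guesses, not my real figures; the point is only the shape of a "every access costs the same" model that SIMD-friendly ops like OP_DUP can beat.

```python
# Toy per-opcode cost model: every stack read and write costs the same.
# The read/write counts below are illustrative, not measured values.
OPCODE_ACCESSES = {
    "OP_DUP":   {"reads": 1, "writes": 1},  # read top of stack, push a copy
    "OP_EQUAL": {"reads": 2, "writes": 1},  # read two operands, push a bool
    "OP_ADD":   {"reads": 2, "writes": 1},
}

def script_cost(script, operand_bytes, unit_cost=1):
    """Cost of a script under a flat 'all accesses are equal' model.

    Assumes naive 64-bit-at-a-time accesses; a SIMD/write-combined
    OP_DUP may really be ~2x faster than this predicts, which is
    exactly what the micro-benchmarks are probing."""
    words = (operand_bytes + 7) // 8
    total = 0
    for op in script:
        acc = OPCODE_ACCESSES[op]
        total += (acc["reads"] + acc["writes"]) * words * unit_cost
    return total
```

Varying `operand_bytes` is where the data explosion comes from: every operand size limit gives a different curve per opcode.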
#dev #cln This morning Shahana reminded me of an idea we had earlier to create "categories" for commands, so runes can filter on that basis. But choosing the categories is a bit of a research project which we never got around to, so we sat down and did it: we described each documented command using keywords, then tried to come up with a list. The original motivation was the "read-only" commando rune: originally it allowed all the list commands and the getinfo command, except not listdatastore, because that gave you access to the master rune itself! That's no longer true, but the point stands: some commands are read-only but reveal information which could allow escalation. (This is why, when we added the command to list runes, we called it showrunes, not listrunes!) When plugins add commands, you want it to be intuitive: if a rune allows the pay command, that won't allow "mpay", for example. The rune probably wants to say "you can send lightning payments" and cover all the commands that are needed, or can be used, to do that. Anyway, the exercise resulted in a first list of categories, which we will play with as we implement: they need to be in documentation, as well as accessible to runes. And Shahana suggested the term "genus" (well, genera) instead of "category".
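A rough sketch of the idea (in Python; the genus names and command assignments here are made up for illustration, not the list we actually came up with):

```python
# Hypothetical command -> genus mapping; plugin-added commands like
# "mpay" get a genus too, so a "payments" rune covers them naturally.
COMMAND_GENUS = {
    "listpeers": "readonly",
    "getinfo":   "readonly",
    "showrunes": "sensitive",  # read-only, but reveals escalation info
    "pay":       "payments",
    "mpay":      "payments",
}

def rune_allows(allowed_genera, command):
    """Would a rune restricted to these genera permit this command?

    Unknown commands get no genus, so they are denied by default."""
    return COMMAND_GENUS.get(command) in allowed_genera
```

So a rune saying "you can send lightning payments" allows both `pay` and `mpay`, while a "readonly" rune still can't touch `showrunes`.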
#dev #cln Yesterday I brunched with a friend, which turned into lunch. I try to catch up with a friend every week or so, to keep my work hours contained. Nonetheless, I continued my work on connectd: it now limits the rate at which it will stream gossip (1MB per second should be enough for anyone!), but this doesn't apply to responses to gossip queries, which get handed off to gossipd to answer. Connectd has the gossip store open to stream it: if it simply switches to our gossmap library it can answer these queries itself, and rate-limit the processing correctly. It also means that it will be able to look up node announcements for connections itself, rather than having them fed from lightningd as now. This simplifies things, and even more so once I have connectd automatically reconnect to important peers, rather than waiting for lightningd to tell it to reconnect.
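The rate limiting is the classic token-bucket shape. A minimal sketch in Python (this is the idea, not connectd's actual implementation; names and the one-second burst allowance are my choices for illustration):

```python
import time

class GossipStreamLimiter:
    """Token bucket limiting gossip streaming to ~1MB per second."""

    def __init__(self, bytes_per_sec=1_000_000, now=time.monotonic):
        self.rate = bytes_per_sec
        self.tokens = bytes_per_sec  # allow up to a one-second burst
        self.now = now
        self.last = now()

    def can_send(self, nbytes):
        t = self.now()
        # Refill for elapsed time, capped at one second's worth.
        self.tokens = min(self.rate, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # caller holds the message and retries later
```

Query responses bypass this entirely in the current code, which is exactly the hole that moving to gossmap would close.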
#dev #cln Today was a good day! On this morning's call with Alex we discovered the cause of an INVALID_ONION error which Elle had just reproduced: LND will still send legacy onions if it doesn't see a node's announcement. We removed support for those years ago, after they were removed from the spec. Still, before breakfast I had a working and tested implementation to sneak into rc2. Not happy about it, but interop wins over my feelings, unfortunately! I also finished code to properly send reestablish messages for closed channels: we weren't doing this properly since some other code changes last release, preferring to send the last error we had. Which was just as well, because doing so was triggering a corner case where we could delay on-chain cleanup on older versions (embarrassingly, my logs show this happened on my own node in Feb 2022, with v0.10.2: I thought it was broken because I had messed with my DB using an experimental version!). Finally, I wrote a simple benchmark to vaguely simulate the slowdown/high CPU that large nodes like Boltz are seeing: I'd wanted to do this since Michael let me strace and profile their connectd in Austin. The actual optimization was surprisingly trivial, and brings the time spent in poll() down to noise. As a bonus, I also implemented and tested gossip rate limiting (1MB per second). PR coming tomorrow!
#dev #cln More work on askrene today. The infrastructure is basically done, and now I'm stealing Eduardo's min cost flow code. But even that isn't trivial to drive: you get the min cost flow, but the resulting path might be too expensive, or too unlikely, or too slow, and you need to tweak parameters (fees vs probability vs delay) then ask again. The current code does a binary search for this over a mu value from 0 to 127, so 7 times. I need to play with this, as I suspect we can shortcut it if we're "good enough". Once you've got that, you look at the htlc_max and htlc_min that were published on the channels: you might have violated those too! If you hit the min, you remove that route and block the channel from future attempts. If you hit the max, you reduce the amount you send down that route. Then you have some sats you still need to send, so you ask the min cost flow solver for a solution for just that part. This is actually unusual, though: most channels route most payments. The question is, how much goes inside the pay oracle, and how much can we punt to the caller? I think the caller can only specify the max fee and max delay, and the oracle needs to give the highest-probability routes within those constraints. I hope to have something tomorrow, but it's going to need a lot of tests! Fortunately this should be easier to test than a complete pay plugin.
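The mu binary search can be sketched like this (in Python; `solve_mcf` and the return shape are illustrative, not askrene's real interface). I'm assuming here, as in renepay, that higher mu weights fees more heavily, so fees fall and success probability worsens as mu rises:

```python
def best_feasible_flow(solve_mcf, max_fee_msat, mu_max=127):
    """Binary-search mu, the fee-vs-probability tradeoff knob.

    solve_mcf(mu) returns (routes, fee_msat), with fee assumed to
    decrease (and probability worsen) as mu rises.  Searching 0..127
    costs ceil(log2(128)) = 7 solver calls, as in the current code."""
    lo, hi = 0, mu_max
    best = None
    while lo <= hi:
        mu = (lo + hi) // 2
        routes, fee = solve_mcf(mu)
        if fee <= max_fee_msat:
            best = (mu, routes, fee)
            hi = mu - 1  # feasible: try lower mu for better probability
        else:
            lo = mu + 1  # too expensive: push harder on fees
    return best  # None if even mu_max blows the fee budget
```

The "good enough" shortcut would mean returning early once a feasible solution clears some probability bar, instead of always finding the minimal feasible mu.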
Just confirmed that I'm heading to Cheatcode in Sydney in October! Don't know what I'll speak about, but I'm sure there will be something happening...
#cln #dev Shahana brought a request to delay startup if bitcoind had gone backwards, rather than refusing to start. This is good for Umbrel and similar apps, where it can happen (disk corruption causes bitcoind to have to resync). Seems simple enough, but that code proved really awkward to work on: so much so that it took over two hours for me to get it working without breaking other cases. "Don't patch bad code, rewrite it", said Kernighan and Plauger. So I spent my afternoon reworking the chain topology startup code into something I wasn't embarrassed by: it now simply and straightforwardly asks for the three pieces of information it needs from bitcoind at startup, then performs initialization. Then, later, it starts the various polling loops (for fees, and for new blocks). Adding code to delay is now trivial. And the next person who touches this code will have a much more pleasant time, without all the scar tissue left from previous incremental changes.
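With the startup untangled, the delay itself reduces to a tiny loop. A sketch in Python (function and parameter names are mine, and `get_height` stands in for one of the queries to bitcoind; the real code is C inside lightningd):

```python
import time

def wait_for_bitcoind_height(get_height, last_known_height,
                             poll_secs=30, sleep=time.sleep):
    """If bitcoind's chain is behind where we last saw it (e.g. a
    resync after disk corruption), wait for it to catch up instead
    of refusing to start."""
    height = get_height()
    while height < last_known_height:
        sleep(poll_secs)  # bitcoind went backwards: poll until it catches up
        height = get_height()
    return height
```

The point of the rewrite is that this loop now has one obvious place to live: after the three startup queries, before initialization and the polling loops.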
#dev #cln Shifted gears today: there's a bug that Christian Decker brought to my attention. There's a Greenlight node where onchaind hasn't finished spending a HTLC tx, which shouldn't happen. Weirdly, I have one of these on my node, but that didn't hugely surprise me as I have done experimental upgrades in the past and edited the db by hand on at least one occasion 😬 But now I know it's real, time to find out why. Since Greenlight restarts nodes all the time, it's pretty clearly a failure across restarts. This code used to be trivial: on restart we would start processing the chain back at the earliest outstanding channel close. This was exactly the same as if it were happening the first time, only faster. But replaying up to 100 blocks at startup is slow, so instead we recorded events in the DB and replayed from there on startup. I never really liked that code, but it *was* much faster. Still, I would rather be simple than fast (as long as we're not preventing the rest of startup from completing), and since there's a bug, it's a chance to simplify. We *do* need to save the tx which closes the channel, and that's simple and fine. But then on restart we can simply walk through the blocks starting with the one containing the funding spend, and tell onchaind about all spends. This is how onchaind works: we follow all txs which spend the channel output, and their children, and as an optimization it tells us to ignore txs it doesn't care about. This didn't actually take that long to code, and as a nice side effect it will even work on pruned nodes, as in the latest release we now ask bitcoind to refetch old blocks if they're pruned. I'll report what happens when I actually run it on my node tomorrow: onchaind does spam my debug logs with a huge number of messages right now, and it has been closed for 120,000 blocks!
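The replay-on-restart walk can be sketched like this (in Python; the block/tx dict shapes and function names are illustrative stand-ins for the real C interfaces, and I've left out onchaind's "ignore this tx" optimization):

```python
def replay_close_for_onchaind(blocks, funding_outpoint, tell_onchaind):
    """Replay on-chain spends to onchaind after a restart.

    Walk blocks starting from the one containing the funding spend,
    tracking the outputs onchaind cares about: the channel output,
    plus the outputs of any tx spending a watched output (children,
    recursively).  blocks is assumed to look like
    [{"txs": [{"txid", "inputs", "n_outputs"}]}], with inputs as
    (txid, output_index) pairs."""
    watched = {funding_outpoint}
    for block in blocks:
        for tx in block["txs"]:
            if any(inp in watched for inp in tx["inputs"]):
                tell_onchaind(tx)
                # Follow children: this tx's outputs are now watched too.
                for i in range(tx["n_outputs"]):
                    watched.add((tx["txid"], i))
```

Because this only needs the blocks themselves, it works on pruned nodes too, given the new ability to ask bitcoind to refetch pruned blocks.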
Today I got back to working on "askrene" for Core Lightning. The idea is to extract the core logic of the experimental "renepay" plugin into an oracle you can ask for routes (and provide feedback on what happened). This is much more composable: anyone can write a plugin which uses this information, *or* an alternate pathfinding plugin to replace it. It's built around the idea of "layers", which are where information lives: you tell it what layers to use when you ask for a set of routes. This has many uses: you might have route hints or blinded paths you want to (or have to!) use. You might want to constrain a payment to a particular channel for rebalancing, etc. In theory, these layers can be exported and imported: you can share information about the state of the network between nodes. I'm sure there's a pile of obfuscation needed to preserve privacy in this case, but it's an interesting idea...
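The layer idea can be sketched like this (in Python; the field names and layer operations are hypothetical, just to show the "overlay on the base gossip view" shape):

```python
def effective_channels(base_channels, layers):
    """Merge the base gossip view with a stack of layers.

    base_channels: {scid: {"capacity_msat": ..., "enabled": True}}.
    A layer may add channels (e.g. from route hints or blinded paths)
    or disable existing ones (e.g. to pin a rebalance to one channel).
    Later layers win; the base map is never modified."""
    view = {scid: dict(ch) for scid, ch in base_channels.items()}
    for layer in layers:
        for scid, ch in layer.get("add_channels", {}).items():
            view[scid] = dict(ch)
        for scid in layer.get("disable", []):
            if scid in view:
                view[scid]["enabled"] = False
    return view
```

Exporting a layer then just means serializing these deltas, which is what makes sharing network state between nodes conceivable (modulo the privacy obfuscation).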
Wow, nostr is giving very "welcome to Linux, now configure your modeline" 90s vibes. Lots of clients, most abandoned, very easy to "choose wrong" and not be able to do things like NIP-46. And yes, you're going to have to learn what that is :)