-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Enable quiescence in production and add timeout config #10001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enables the quiescence protocol to run in non-development environments, replacing the old fixed timeout with a configurable value. Key changes include passing the new timeout config throughout peer connection, updating the configuration and validation for the htlcswitch, and revising integration tests to verify the timeout behavior.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
server.go | Added QuiescenceTimeout propagation in peerConnected configuration. |
sample-lnd.conf | Documented the new config option with usage guidance. |
peer/brontide.go | Updated Config to include the QuiescenceTimeout value in link creation. |
lncfg/protocol.go | Adjusted NoQuiescence() to enable quiescence in production. |
lncfg/htlcswitch.go | Introduced new timeout constants, config field, and validation logic. |
itest/lnd_quiescence_test.go | Expanded integration tests to assert timeout behavior and node reconnection. |
htlcswitch/quiescer.go | Updated error message formatting and removed unused timeout constant. |
htlcswitch/link.go | Updated ChannelLinkConfig and link initialization to use the new timeout. |
docs/release-notes/release-notes-0.20.0.md | Added release note documentation for the new config. |
config.go | Set default QuiescenceTimeout value in the overall config. |
Important Review skippedAuto reviews are limited to specific labels. 🏷️ Labels to auto review (1)
Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
4951959
to
4240edd
Compare
|
||
// DefaultQuiescenceTimeout specifies the default value to be used for | ||
// `QuiescenceTimeout`. | ||
DefaultQuiescenceTimeout = 60 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the current behavior if this is hit? Do we drop the connection, then try to reconnect again? Or is there some messages that both sides send to open things up again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the link we will simply disconnect. then due to the connection is persistent as we have a channel, both sides will try to reconnect.
@@ -147,7 +147,7 @@ func (l *ProtocolOptions) NoExperimentalEndorsement() bool { | |||
|
|||
// NoQuiescence returns true if quiescence is disabled. | |||
func (l *ProtocolOptions) NoQuiescence() bool { | |||
return true | |||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to enable this for non-dev envs if nothing user facing uses it yet?
Also where do we even check this value? IIUC there's no standalone feature bit. The RPC call doesn't check the value:
lnd/lnrpc/devrpc/dev_server.go
Lines 356 to 385 in 7be45c3
// Quiesce initiates the quiescence process for the channel with the given | |
// channel ID. This method will block until the channel is fully quiesced. | |
func (s *Server) Quiesce(_ context.Context, in *QuiescenceRequest) ( | |
*QuiescenceResponse, error) { | |
txid, err := lnrpc.GetChanPointFundingTxid(in.ChanId) | |
if err != nil { | |
return nil, err | |
} | |
op := wire.NewOutPoint(txid, in.ChanId.OutputIndex) | |
cid := lnwire.NewChanIDFromOutPoint(*op) | |
ln, err := s.cfg.Switch.GetLink(cid) | |
if err != nil { | |
return nil, err | |
} | |
select { | |
case result := <-ln.InitStfu(): | |
mkResp := func(b lntypes.ChannelParty) *QuiescenceResponse { | |
return &QuiescenceResponse{Initiator: b.IsLocal()} | |
} | |
return fn.MapOk(mkResp)(result).Unpack() | |
case <-s.quit: | |
return nil, fmt.Errorf("server shutting down") | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we have this config,
Lines 1450 to 1451 in 7be45c3
DisallowQuiescence: p.cfg.DisallowQuiescence || | |
!p.remoteFeatures.HasFeature(lnwire.QuiescenceOptional), |
and the DisallowQuiescence
is set here,
Line 4531 in 7be45c3
DisallowQuiescence: s.cfg.ProtocolOptions.NoQuiescence(), |
Do we need to enable this for non-dev envs if nothing user facing uses it yet?
Can also update it once dyn is in place tho, was just preparing for dyn.
itest/lnd_quiescence_test.go
Outdated
// | ||
// TODO(yy): When refactoring the peer connection we need to have an RPC | ||
// that reports how many times the peer has been connected. | ||
ht.AssertConnected(alice, bob) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can this this call here? https://lightning.engineering/api-docs/api/lnd/lightning/subscribe-peer-events/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh nice didn't know that, yeah updated!
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request enables the quiescence protocol in production environments and introduces a new configuration option, --htlcswitch.quiescencetimeout
, to control the maximum duration a channel can be quiescent. The changes are well-implemented, propagating the new configuration value from the top-level config down to the channel link where it's used. The integration test has been significantly improved to cover the new timeout behavior, including node restarts and disconnects due to timeouts. The documentation, including release notes and the sample configuration file, has been updated accordingly. The code is clean and the changes are consistent with the PR's objective. I have one minor suggestion for improving the grammar in the release notes for better clarity.
4240edd
to
2e62d32
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request enables the quiescence protocol in production environments and introduces a configurable timeout for it. The changes are well-implemented across the configuration, peer management, and HTLC switch logic. The integration test has been significantly improved to cover the new timeout functionality, including peer disconnection and reconnection scenarios.
I've found a couple of minor documentation issues in comments that could be improved for clarity. Other than that, the changes look solid and well-tested.
This commit makes removes the `defaultQuiescenceTimeout` and makes it configurable as different nodes have different network environment. In addition the default timeout has been increased from 30s to 60s.
The original test lets Bob send an HTLC to Alice, but Bob doesn't have any balance to begin with. This commit now fixes it by explicitly checking sending payment is allowed before quiescence, and forbidden after.
2e62d32
to
b17d804
Compare
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request enables the quiescence protocol for production use and introduces a configurable timeout. The changes are well-implemented across the configuration, peer management, and htlcswitch components. The integration test has been substantially improved to validate the new timeout behavior and various quiescence scenarios. I've left a few minor comments regarding documentation and test code clarity. Overall, this is a solid contribution.
And a few assertion helpers.
b17d804
to
3034803
Compare
3034803
to
ef7d13b
Compare
This PR enables the quiescence protocol to run in non-dev env. A new timeout config value is introduced to replace the old default value, which controls how long the channel can be quiescent. In addition, the itest is fixed and updated to check the timeout behavior.