Problem Description
The sweeper has a race condition: inputs are grouped and broadcast before their historical rescan completes, so an entire input set can be marked as Fatal when only one of its inputs was already spent.
When sweeping inputs with old heightHint values, the chain notifier performs a historical rescan to check whether each input has already been spent. However, the sweeper does not wait for this rescan to complete before:
- Grouping the input with other inputs in BudgetAggregator.ClusterInputs
- Broadcasting the sweep transaction
If the historical rescan later discovers that one input was already spent, the broadcast fails and all inputs in the group are marked as Fatal, which prevents the still-valid inputs from ever being swept.
Code Analysis
Root Cause
- Location: the monitorSpend function in sweep/sweeper.go
- Issue: RegisterSpendNtfn starts an asynchronous historical rescan but returns immediately
- Impact: the input becomes available for grouping before its spend status is known
Affected Components
- Input Grouping: the ClusterInputs method in sweep/aggregator.go groups inputs without waiting for the rescan to complete
- Error Handling: the markInputsFatal function in sweep/fee_bumper.go marks the entire input set as Fatal on broadcast failure
- Immediate Sweeping: input processing in the collector loop can trigger sweeping before the rescan completes
Error Propagation Flow
Old input with heightHint → RegisterSpendNtfn (starts rescan) →
Input grouped immediately → Broadcast fails (input already spent) →
Entire input set marked Fatal → Valid inputs never retried
Proposed Solutions
Solution A: Wait for Historical Rescan
Solution B: Better Error Attribution
- Modify the handleMissingInputs function to identify which specific input caused the failure
- Mark only the problematic input as Fatal and retry the others
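A minimal sketch of what per-input attribution could look like, assuming some way to ask whether a given outpoint is already spent. The sweepInput type, the spentChecker callback, and attributeFailure are hypothetical names for illustration and do not reflect the actual handleMissingInputs signature.

```go
package main

// inputState is a simplified stand-in for the sweeper's input state.
type inputState int

const (
	statePending inputState = iota
	stateFatal
)

// sweepInput is a hypothetical, stripped-down sweep input.
type sweepInput struct {
	outpoint string
	state    inputState
}

// spentChecker reports whether an outpoint is already spent. It is a
// hypothetical stand-in for a chain or mempool lookup.
type spentChecker func(outpoint string) bool

// attributeFailure is meant to run after a failed broadcast. It marks
// only the inputs that are actually spent as Fatal and returns the
// remaining inputs so they can be regrouped and retried.
func attributeFailure(group []*sweepInput, isSpent spentChecker) []*sweepInput {
	var retry []*sweepInput
	for _, in := range group {
		if isSpent(in.outpoint) {
			in.state = stateFatal
			continue
		}
		retry = append(retry, in)
	}
	return retry
}
```

With this split, a group containing one spent input yields a retry set with every other input still pending, instead of the whole set being lost.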