Fix BackupRepositories becoming stale when BSL config changes while Velero is not running #9236
base: main
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff (main vs. #9236)

             main      #9236     +/-
Coverage     59.64%    59.75%    +0.11%
Files        382       382
Lines        43960     44091     +131
Hits         26218     26346     +128
Misses       16195     16195
Partials     1547      1550      +3

☔ View full report in Codecov by Sentry.
Community Meeting: check whether the existing invalidation logic can be reused instead of adding more code.
a6b170b to 5a40e17
29fe78d to 196bc7b
…elero is not running

This change validates BackupRepository configurations against their associated BackupStorageLocation on controller startup. If BSL configuration (bucket, prefix, CACert, or config) has changed while Velero was not running, the affected repositories are invalidated and will be re-established.

Key changes:
- Add startup validation that checks all BackupRepositories against current BSL configs
- Store BSL configuration in BackupRepository annotations for comparison on startup
- Add shared compareBSLConfigs function to eliminate code duplication
- Move BSL annotation constants to labels_annotations.go for consistency
- Add comprehensive test coverage for startup validation logic

Fixes vmware-tanzu#8279

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
This commit adds an E2E test to verify that backup repositories are properly validated against BSL configuration changes when Velero restarts. The test simulates a scenario where BSL configuration changes while Velero is not running and verifies that repositories are invalidated on startup with the correct error message.

Test scenario:
1. Creates a backup to establish a BackupRepository
2. Scales down Velero deployment (simulating shutdown)
3. Modifies BSL configuration (changes prefix)
4. Scales up Velero deployment (simulating startup)
5. Verifies repository is invalidated with correct message
6. Restores original BSL configuration
7. Verifies repository recovers to Ready state

Changes:
- Added new E2E test file: test/e2e/bsl-mgmt/startup_validation.go
- Registered test in test/e2e/e2e_suite_test.go
- Added test label to GitHub workflow matrix for CI execution

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Use pod status checks instead of fixed delays
Summary
This PR fixes issue #8279 where BackupRepositories become stale when BackupStorageLocation (BSL) configuration is updated or created while Velero is not running.
Problem
When a BSL is updated/created while Velero is not running, existing BackupRepositories that reference the BSL become stale and continue using the old configuration. This prevents successful backups/restores until the repositories are manually deleted.
Solution
The implementation validates BackupRepository configurations against their associated BackupStorageLocation on controller startup. If BSL configuration (bucket, prefix, CACert, or config) has changed while Velero was not running, the affected repositories are invalidated and will be re-established.
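
A minimal sketch of the comparison idea described above, assuming the stored and current BSL settings are both available as annotation-style string maps. The annotation keys are the ones named in this PR; the helper `changedFields` and the sample values are hypothetical and do not reproduce the actual `compareBSLConfigs` signature.

```go
package main

import "fmt"

// Annotation keys named in this PR's description.
const (
	annBucket     = "velero.io/bsl-bucket"
	annPrefix     = "velero.io/bsl-prefix"
	annCACertHash = "velero.io/bsl-cacert-hash"
	annConfig     = "velero.io/bsl-config"
)

// changedFields is a hypothetical helper: it returns the annotation keys
// whose stored value differs from the value derived from the current BSL.
// A key that is missing from one map compares as the empty string; how the
// real controller treats missing annotations is not stated in the PR.
func changedFields(stored, current map[string]string) []string {
	var changed []string
	for _, key := range []string{annBucket, annPrefix, annCACertHash, annConfig} {
		if stored[key] != current[key] {
			changed = append(changed, key)
		}
	}
	return changed
}

func main() {
	// Values recorded on the repository before Velero stopped (made up).
	stored := map[string]string{annBucket: "backups", annPrefix: "prod"}
	// Values derived from the BSL as it looks at startup (made up).
	current := map[string]string{annBucket: "backups", annPrefix: "staging"}

	if diff := changedFields(stored, current); len(diff) > 0 {
		fmt.Println("invalidate repository; changed:", diff)
	}
}
```

Returning the list of changed keys, rather than a bare boolean, makes it straightforward to put the reason for invalidation into a status message.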
Implementation Details
Core Changes:
- Startup Validation (validateBackupRepositoriesOnStartup):
- Configuration Tracking: velero.io/bsl-bucket, velero.io/bsl-prefix, velero.io/bsl-cacert-hash, velero.io/bsl-config
- Shared Comparison Logic (compareBSLConfigs): needInvalidBackupRepoOnStartup and needInvalidBackupRepo
- Thread Safety:
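
The configuration tracking listed above could be sketched roughly as follows. The PR names the four annotation keys but this page does not show how the values are encoded, so the choices here are assumptions: SHA-256 for the CA certificate hash, JSON for the config map, and a hypothetical `bslSnapshot` type standing in for the relevant BSL fields.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// bslSnapshot is a hypothetical stand-in for the BSL fields named in the PR.
type bslSnapshot struct {
	Bucket string
	Prefix string
	CACert []byte
	Config map[string]string
}

// annotationsFor builds the annotation values to record on a repository.
// Assumptions: SHA-256 hex digest for the CA cert, JSON for the config map.
func annotationsFor(b bslSnapshot) (map[string]string, error) {
	// encoding/json writes map keys in sorted order, so equal maps always
	// serialize to the same string.
	cfg, err := json.Marshal(b.Config)
	if err != nil {
		return nil, err
	}
	sum := sha256.Sum256(b.CACert)
	return map[string]string{
		"velero.io/bsl-bucket":      b.Bucket,
		"velero.io/bsl-prefix":      b.Prefix,
		"velero.io/bsl-cacert-hash": hex.EncodeToString(sum[:]),
		"velero.io/bsl-config":      string(cfg),
	}, nil
}

func main() {
	ann, err := annotationsFor(bslSnapshot{
		Bucket: "backups",
		Prefix: "prod",
		CACert: []byte("example certificate bytes"),
		Config: map[string]string{"region": "us-east-1"},
	})
	if err != nil {
		panic(err)
	}
	for key, value := range ann {
		fmt.Printf("%s=%s\n", key, value)
	}
}
```

Storing a digest instead of the certificate itself keeps the annotation short while still changing whenever the certificate bytes change.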
Testing
Unit Tests Added:
- TestValidateBackupRepositoriesOnStartup: Tests the startup validation logic with various scenarios
- TestNeedInvalidBackupRepoOnStartup: Tests the comparison logic for startup validation

E2E Test Added:
- test/e2e/bsl-mgmt/startup_validation.go
Files Changed:
- pkg/controller/backup_repository_controller.go: Core implementation
- pkg/controller/backup_repository_controller_test.go: Unit tests
- pkg/apis/velero/v1/labels_annotations.go: BSL annotation constants
- test/e2e/bsl-mgmt/startup_validation.go: E2E test
- test/e2e/e2e_suite_test.go: Test registration
- .github/workflows/e2e-test-kind.yaml: CI test matrix
- changelogs/unreleased/8279-kaovilai: Changelog entry

Fixes
Fixes #8279
Test plan
Note
Responses generated with Claude