@gmierz gmierz commented Oct 9, 2025

This set of patches adds a new alert management system for telemetry alerts. The commits below split the system into logical chunks, with newer commits building on earlier ones.

Some generic base and utility classes are added directly to the auto_perf_sheriffing folder. These are not specific to telemetry alerting and could be used in other performance sheriffing automation.

The concrete classes for telemetry alert management are found in the treeherder/perf/auto_perf_sheriffing folder. These are then integrated into the telemetry detection code in Sherlock through the TelemetryAlertManager and run from TelemetryAlertManager.manage_alerts.

The manage_alerts method is defined generically in the AlertManager class. It starts by updating the DB with any changes made to telemetry bugs in Bugzilla; at the moment this only covers their resolutions. Next, bugs are filed for alerts on any probe that requests one (by setting the monitor.alert field to True in its probe definition). Once bugs are filed, these bugs and any existing bugs are modified as needed. Currently this only updates the see_also field to link together all bugs filed for the same detection range, in other words all the bugs that belong to the same PerformanceTelemetryAlertSummary. At the end of this "bug handling" phase, emails are produced for any alerts that request them (an alert produces either a bug or an email, never both, to reduce spam). Finally, either the bug modifications or the emails can fail. In that case, a "housekeeping" stage retries the failed alerts on a daily basis.
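For reviewers who want the order of operations at a glance, here is a rough, self-contained sketch of the flow described above. It is not the code in this PR: only `AlertManager` and `manage_alerts` are names from the patches, while the hook names (`update_alerts`, `file_bugs`, `modify_bugs`, `email_alerts`, `house_keeping`), the dict-based alerts, and the `notify` and `summary_id` keys are assumptions made up for illustration.

```python
from abc import ABC, abstractmethod


class AlertManager(ABC):
    """Illustrative stand-in for the generic manager; hook names are hypothetical."""

    def manage_alerts(self, alerts):
        # 1. Sync the DB with bug changes (currently only resolutions).
        self.update_alerts()
        # 2. An alert gets a bug or an email, never both.
        to_bug = [a for a in alerts if a["notify"] == "bug"]
        to_email = [a for a in alerts if a["notify"] == "email"]
        # 3. Bug handling phase: file bugs, then modify them.
        self.file_bugs(to_bug)
        self.modify_bugs(to_bug)
        # 4. Emails for the alerts that asked for them.
        self.email_alerts(to_email)
        # 5. Retry anything that failed earlier.
        self.house_keeping()

    @abstractmethod
    def update_alerts(self): ...

    @abstractmethod
    def file_bugs(self, alerts): ...

    @abstractmethod
    def modify_bugs(self, alerts): ...

    @abstractmethod
    def email_alerts(self, alerts): ...

    @abstractmethod
    def house_keeping(self): ...


class RecordingAlertManager(AlertManager):
    """Toy subclass that records calls instead of talking to Bugzilla."""

    def __init__(self):
        self.calls = []
        self.see_also = {}
        self.emailed = []

    def update_alerts(self):
        self.calls.append("update_alerts")

    def file_bugs(self, alerts):
        self.calls.append("file_bugs")
        # Fake bug ids; a real manager would get these from Bugzilla.
        for fake_id, alert in enumerate(alerts, start=1):
            alert["bug_id"] = fake_id

    def modify_bugs(self, alerts):
        self.calls.append("modify_bugs")
        # Group bugs by alert summary, then link each bug to its siblings.
        by_summary = {}
        for alert in alerts:
            by_summary.setdefault(alert["summary_id"], []).append(alert["bug_id"])
        for bug_ids in by_summary.values():
            for bug_id in bug_ids:
                self.see_also[bug_id] = [b for b in bug_ids if b != bug_id]

    def email_alerts(self, alerts):
        self.calls.append("email_alerts")
        self.emailed = [a["probe"] for a in alerts]

    def house_keeping(self):
        self.calls.append("house_keeping")


alerts = [
    {"probe": "probe_a", "summary_id": 1, "notify": "bug"},
    {"probe": "probe_b", "summary_id": 1, "notify": "bug"},
    {"probe": "probe_c", "summary_id": 1, "notify": "email"},
]
manager = RecordingAlertManager()
manager.manage_alerts(alerts)
```

In this toy run, the two bug alerts share a summary, so each fake bug ends up listing the other in its `see_also` entry, and `probe_c` only receives an email. The sketch does not model failures or the daily retry schedule; `house_keeping` is just a placeholder for that stage.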

For treeherder-admins, the relevant changes are in the first commit, where I add a new env field to capture BUG_COMMENTER_API_KEY when it is set locally. This is needed for testing the bug modification aspect of the management system.

@gmierz gmierz force-pushed the telemetry-alert-manager-comp branch from 44fab00 to 0e26ca1 Compare October 9, 2025 12:51
@gmierz gmierz requested a review from Andrej1198 October 9, 2025 15:43

gmierz commented Oct 17, 2025

Here's a sample bug that is filed by this: https://bugzilla.mozilla.org/show_bug.cgi?id=1993145
