-
Notifications
You must be signed in to change notification settings - Fork 58
PlanningReport: first pass at operator notes #9116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
These should be (a) serialized in the db along with the blueprint they came from and (b) be available in the external API, but I wanted to get a first pass out so folks could look over the way I've pulled these out. I'm not at all sure this is the best way to generate these strings, and even if it is, they undoubtedly need some serious wordsmithing for external consumption and consistency.
|
||
let operator_notes = report.operator_notes().into_notes(); | ||
if !operator_notes.is_empty() { | ||
println!("\nnotes for customer operator:"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiniest of nits; you think we could just say "notes for operator", here and below? "customer operator" feels like an unnecessary qualifier
"{nsleds_error_manifest} sled{} have unexpected errors \ | ||
on their boot disks", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about:
"{nsleds_error_manifest} sled{} have unexpected errors \ | |
on their boot disks", | |
"{nsleds_error_manifest} sled{} have unexpected errors \ | |
and are unable to access their stored configuration", |
(It happens to be stored on the boot disk, but the problem is that we can't access the configuration for zones when we want it)
|
||
#[derive(Debug, Clone)] | ||
pub struct PlanningOperatorNotes { | ||
notes: Vec<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm seeing a few types of notes falling out from this initial construction:
- "This should never happen" (e.g., orphaned disks, zombie sleds)
- "This could happen, and is blocking planning from proceeding" (e.g., no target set, we don't have a disk to place a necessary NTP service, etc)
- "This could happen, and should be transient, but may be an issue if it's recurring" (e.g., something not seen in inventory)
- "This is informational, and may or may not be expected behavior". (e.g., converting sleds to known versions)
I wonder if it's worth pulling these categories apart to highlight certain higher-value messages (e.g., "you're watching update status, but never set a target!") from lower-value messages (e.g., "we did work, and things progressed as expected").
Don't have to do this yet, as we're only exposing this info to omdb, but this seems like a spot where we might add that "categorization" explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need categories like those in order to surface only the problems in the external API (not sure we will wants to include the info notes).
} | ||
if npools > 0 { | ||
self.notes.push(format!( | ||
"{npools} disk{} are unavailable across {nsleds} sled{}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"{npools} disk{} are unavailable across {nsleds} sled{}", | |
"{npools} disk{} are missing storage service across {nsleds} sled{}", |
We chatted about this briefly during today's update watercooler (recorded). Some thoughts and issues from that discussion:
This PR only adds an in-memory version of the notes, and it attempts to mostly not emit things that are expected during updates. I'm sure I missed some in both directions, but it's a starting point. A proposal is to bikeshed this PR a bit until we're happy enough, knowing that as of this PR, these notes are only visible via
omdb
. If we can land this and start using it during updates, we might get some idea of whether these are worth surfacing to the external API at all in the R17 time frame, or whether we want to get more update experience first. If we do want to surface these, we'll need to serialize them as part of the blueprint, add them to the external API, and maybe add a way to see the notes from the planner task and not just the most recent target blueprint. All of those should be straightforward.