karthikvetrivel
Member

Usage

Add a /cherry-pick comment to your PR specifying target release branches:

/cherry-pick release-25.3
/cherry-pick release-24.9

When to add the comment:

  • Before merge: Bot acknowledges and automatically backports after PR is merged
  • After merge: Bot immediately starts backporting
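As a rough illustration of the usage above, the comment handling could be sketched as follows. The helper names `parseCherryPickTargets` and `actionForComment` are hypothetical, and the sketch assumes one `/cherry-pick <branch>` command per line; the actual bot may parse comments differently.

```javascript
// Hypothetical helper: extract target branches from a PR comment body.
// Assumes one "/cherry-pick <branch>" command per line, as in the usage above.
function parseCherryPickTargets(commentBody) {
  const targets = [];
  for (const line of commentBody.split(/\r?\n/)) {
    const match = line.trim().match(/^\/cherry-pick\s+(\S+)$/);
    // Ignore non-command lines and repeated branches.
    if (match && !targets.includes(match[1])) {
      targets.push(match[1]);
    }
  }
  return targets;
}

// Hypothetical helper: decide what to do when the comment arrives.
// `merged` mirrors the boolean of the same name on GitHub's pull request object.
function actionForComment(pullRequest) {
  return pullRequest.merged ? "backport-now" : "acknowledge-and-wait";
}
```

With the two example commands above in a single comment, `parseCherryPickTargets` would return both branch names in the order they were written.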

What Happens

  1. ✅ Bot creates a new PR for each target branch
  2. 🏷️ Applies labels: backport + auto-backport (clean) or needs-manual-resolution (conflicts)
  3. 💬 Posts a comment with link to each backport PR

Example

Example of a PR where you specify the target release branch:

[Screenshot: PR comment specifying the target release branch]

Example of new PR opened:

Screenshot 2025-10-08 at 1 46 32 PM

Conflict Handling

If cherry-pick has conflicts:

  • PR is created in draft mode with needs-manual-resolution label
  • Follow instructions in the backport PR description to resolve conflicts
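The labeling and draft behavior described above could be captured in a small decision function. This is only a sketch: `backportPrOptions` is a hypothetical name, and it assumes the `backport` label is always applied, with `auto-backport` or `needs-manual-resolution` added depending on the outcome, which is one reading of the label list in "What Happens".

```javascript
// Hypothetical helper: choose draft state and labels for a backport PR.
// Assumption: "backport" is always applied; the second label depends on
// whether the cherry-pick applied cleanly.
function backportPrOptions(hasConflicts) {
  return {
    draft: hasConflicts,
    labels: hasConflicts
      ? ["backport", "needs-manual-resolution"]
      : ["backport", "auto-backport"],
  };
}
```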

Branch Naming

Release branches must follow the pattern: release-X.Y

Examples: release-25.3, release-24.9.1, release-23.9
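A branch-name check along these lines might look like the sketch below. The function name `isReleaseBranch` and the exact regular expression are assumptions: the stated pattern is `release-X.Y`, but because `release-24.9.1` appears among the examples, this sketch also accepts an optional third numeric component.

```javascript
// Hypothetical validator for target branch names.
// Assumption: X, Y (and an optional Z) are purely numeric.
const RELEASE_BRANCH_PATTERN = /^release-\d+\.\d+(\.\d+)?$/;

function isReleaseBranch(branchName) {
  return RELEASE_BRANCH_PATTERN.test(branchName);
}
```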

Example

Original PR #1234 merged to main
  ↓
/cherry-pick release-25.3 comment
  ↓
Bot creates PR #1235: [release-25.3] Original PR title
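The title format in the flow above can be sketched as a one-line helper. `backportTitle` is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper: prefix the original PR title with the target branch,
// matching the "[release-25.3] Original PR title" format shown above.
function backportTitle(targetBranch, originalTitle) {
  return `[${targetBranch}] ${originalTitle}`;
}
```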

…nches

Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>
Contributor

@cdesiniotis cdesiniotis left a comment

Made a quick first pass. This is a great start!

Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>
Collaborator

@ArangoGutierrez ArangoGutierrez left a comment

Awesome initiative! Left some comments.

Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>
@karthikvetrivel karthikvetrivel force-pushed the ci/integrate-cherry-pick-bot branch from 32320fc to e1ae960 Compare October 8, 2025 20:45
@karthikvetrivel
Member Author

Screenshot 2025-10-08 at 4 53 41 PM

@tariq1890 Updated the backport bot to cherry-pick individual commits instead of merge commits. Tested with a 2-commit PR to verify it handles multiple commits correctly.
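For illustration, cherry-picking a PR's individual commits rather than its merge commit could be planned as in the sketch below. This is not the bot's actual code: `cherryPickPlan` is a hypothetical helper, the input shape (objects with `sha` and `parents`) is modeled on what GitHub's "list commits on a pull request" endpoint returns, and the use of `git cherry-pick -x` (which records the source commit in the message) is an assumption.

```javascript
// Hypothetical helper: build one `git cherry-pick` invocation per PR commit,
// in the order given, skipping any commit that has more than one parent
// (i.e. a merge commit).
function cherryPickPlan(commits) {
  return commits
    .filter((commit) => commit.parents.length <= 1)
    .map((commit) => ["git", "cherry-pick", "-x", commit.sha]);
}
```

Each returned array could then be passed to a process runner, with the first element as the executable and the rest as arguments, stopping at the first command that reports a conflict.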

Collaborator

@ArangoGutierrez ArangoGutierrez left a comment

Not a JS expert, just a few comments from what I can remember from my early days with JS

Member

@elezar elezar left a comment

Thanks @karthikvetrivel. This is great.

Is this something that we want to publish as an action at some point, or what options do we have to share this to other repos we own?

Maybe as a follow up: What does the landscape of available actions for doing this look like?

@ArangoGutierrez
Collaborator

Thanks @karthikvetrivel. This is great.

Is this something that we want to publish as an action at some point, or what options do we have to share this to other repos we own?

++ to this. Since @karthikvetrivel's implementation is already in JS, we could keep this JS code in k8s-test-infra and export it from there as a GitHub Action that we can reuse in all our repos.

@karthikvetrivel
Member Author

@ArangoGutierrez @elezar thanks for the review! Regarding rollout, I see a few options.

  1. As @ArangoGutierrez mentioned, we can host it as a reusable workflow in k8s-test-infra. Other repos can call it with something like this:
jobs:
  cherrypick:
    uses: NVIDIA/k8s-test-infra/.github/workflows/cherrypick.yml@main

This seems the simplest to me.

  2. Package it as a composite action in k8s-test-infra. Repos would use it as a step:
  - uses: NVIDIA/k8s-test-infra/.github/actions/backport@main
    with:
      github-token: ${{ secrets.GITHUB_TOKEN }}

This gives repos more flexibility to add custom steps before/after, but our bot needs full job isolation for git operations (the script will fail if the initial starting state is incorrect), so option 1 seems better.

  3. Build it as a full JavaScript action with @vercel/ncc and publish it to GitHub Marketplace. This would only make sense if we want to open-source it for the broader community, as it adds build/release complexity.

  4. Keep the copy-paste approach (current state). This works but means updating multiple repos whenever we improve the logic.

It seems like option 1 is the best to me, but I would love to hear your thoughts.

Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>
@karthikvetrivel karthikvetrivel force-pushed the ci/integrate-cherry-pick-bot branch from 7dbcd48 to 87e9703 Compare October 9, 2025 20:04
@ArangoGutierrez
Collaborator

As you recommend option 1, let's go with that.

I think we can finish the review here, and once we all agree with the implementation, we can talk about a centralized GitHub Action so other repos can benefit from this tool.

@karthikvetrivel
Member Author

@ArangoGutierrez Thanks!

@elezar @cdesiniotis @tariq1890 what do you think about the current plan? As next steps, we can:

  1. Close this PR w/o merging once everyone is satisfied with the outlined functionality
  2. Centralize backport workflow as a reusable component in k8s-test-infra & discuss target repos we'd like to have the action
  3. Create PRs in target repos for initiating the action.

@rahulait rahulait left a comment

Overall, the PR looks good to me.

Member

@elezar elezar left a comment

Thanks @karthikvetrivel. I haven't given this an in-depth review.

I think a more pragmatic approach w.r.t how to move forward would be to merge this to the GPU Operator repo and collect experience with how this behaves before we try to centralize the functionality. This will give us a tighter development cycle and allow us to iron out any initial issues before making this more generally available (should we choose to do so).

One question that has just popped into my head is whether this command checks for properties of the user issuing the command at all?

@karthikvetrivel
Member Author

@elezar That sounds good to me re: merging this into GPU Operator and seeing what we'd like to change/keep.

This command does not check for properties of the user issuing the command. I originally thought about checking membership in the NVIDIA org, but that is difficult because not everyone makes their organization membership public. I had a chat with @rahulait and it doesn't seem strictly necessary from a security perspective.

Member

@elezar elezar left a comment

LGTM. Let's get this in and collect data.

@karthikvetrivel karthikvetrivel merged commit 0428e9c into NVIDIA:main Oct 21, 2025
16 checks passed

🤖 Backport PR created for release-25.3: #1803


🤖 Backport PR created for release-24.9: #1804

7 participants