AlexYinHan
Contributor

What is the purpose of the change

This PR fixes FLINK-38336 by allowing the ForSt state backend to reuse restored files in a failover scenario.

Brief change log

  • Reorganize the local/remote path of ForSt into a ForStPathContainer
  • Determine whether we are in a failover scenario by comparing the DB path and the file paths stored in StateHandles
  • Use ReusableDataTransferStrategy if we are in a failover scenario
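
The failover check described in the change log could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `FailoverCheckSketch`, the `Strategy` enum, and both method names are hypothetical, and the sketch assumes the check amounts to "every restored file already lives under the DB's remote directory". The actual comparison in ForSt may differ.

```java
import java.net.URI;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch: pick a transfer strategy by comparing paths. */
public class FailoverCheckSketch {

    /** Stand-ins for ReusableDataTransferStrategy and CopyDataTransferStrategy. */
    public enum Strategy { REUSE, COPY }

    /**
     * Returns true if there is at least one restored file and every restored
     * file is located under dbPath (same scheme, same authority, path prefix).
     * Assumption: this is what "failover scenario" means for this sketch.
     */
    public static boolean allFilesUnderDbPath(URI dbPath, List<URI> restoredFiles) {
        if (restoredFiles.isEmpty()) {
            return false; // nothing to reuse
        }
        URI db = dbPath.normalize();
        String base = db.getPath();
        // Append a separator so "/job-a/db" does not match "/job-a/db2/...".
        String prefix = base.endsWith("/") ? base : base + "/";
        for (URI file : restoredFiles) {
            URI f = file.normalize();
            boolean sameStore =
                    Objects.equals(db.getScheme(), f.getScheme())
                            && Objects.equals(db.getAuthority(), f.getAuthority());
            if (!sameStore || f.getPath() == null || !f.getPath().startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    /** Reuse files in place on failover; otherwise fall back to copying. */
    public static Strategy chooseForRestore(URI dbPath, List<URI> restoredFiles) {
        return allFilesUnderDbPath(dbPath, restoredFiles) ? Strategy.REUSE : Strategy.COPY;
    }

    public static void main(String[] args) {
        URI db = URI.create("hdfs://nn/flink/job-a/db");
        System.out.println(chooseForRestore(
                db, List.of(URI.create("hdfs://nn/flink/job-a/db/000012.sst"))));
        System.out.println(chooseForRestore(
                db, List.of(URI.create("hdfs://nn/flink/job-b/db/000012.sst"))));
    }
}
```

The trailing separator on the prefix is a deliberate choice in this sketch: a plain `startsWith` on the raw directory path would wrongly treat a sibling directory with a longer name as being inside the DB path.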

Verifying this change

This change added tests and can be verified as follows:

  • DataTransferStrategyTest#testBuildingStrategyAsExpected

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

@flinkbot
Collaborator

flinkbot commented Sep 25, 2025

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

.getSharedStateDirectory();
FsCheckpointStorageAccess fsCheckpointStorageAccess =
(FsCheckpointStorageAccess) env.getCheckpointStorageAccess();
remoteJobPath = fsCheckpointStorageAccess.getCheckpointsDirectory();
Contributor

We can only share state from the shared directory, so what exactly does remoteJobPath mean?

: new CopyDataTransferStrategy(forStFlinkFileSystem);
LOG.info(
"Build DataTransferStrategy for Restore: {}, forStFlinkFileSystem: {}, cpSharedFs:{}, recoveryClaimMode:{}",
"Build DataTransferStrategy for Restore: {}, forStFlinkFileSystem: {}, cpSharedFs:{}, isDbUnderSameJobPathFromRestore:{}, recoveryClaimMode:{}",
Contributor

I'd suggest making this a debug-level log, as well as the one in buildForSnapshot.
