Adding a new nnUNetTrainer variant for extensive data augmentations on GPU #2911

@NathanMolinier

Description

Hello nnUNet Community,

I am currently working on a project exploring how data augmentations can improve model performance for 3D segmentation tasks on MRI images, using nnUNet as a baseline for my experiments.

As part of this effort, I introduced a new trainer called nnUNetTrainerDAExt (shared in another issue, and potentially in a future PR), which applies a broader range of augmentations and has shown promising improvements at inference time.

However, the main drawback of this trainer, as with the other augmentation-based trainers in the variants folder, is that augmentations are performed on the CPU. This creates a significant bottleneck, especially when several augmentations are combined, while the GPUs remain underutilized. For instance, with nnUNetTrainerDAExt, one epoch takes anywhere from about 700 seconds to over 3000 seconds, depending on the number of CPU cores available (the transforms use multiprocessing). At nnU-Net's default of 1000 epochs, the slow end of that range means a training run that stretches beyond a month.
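As a rough sketch of the arithmetic behind those figures, the snippet below converts a per-epoch time into total training days. It assumes nnU-Net's default schedule of 1000 epochs and ignores validation and checkpointing overhead; `training_days` is a hypothetical helper written for this illustration, not part of nnU-Net.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def training_days(seconds_per_epoch: float, num_epochs: int = 1000) -> float:
    """Total wall-clock training time in days (hypothetical helper).

    Assumes every epoch takes the same time and that num_epochs matches
    the nnU-Net default of 1000.
    """
    return seconds_per_epoch * num_epochs / SECONDS_PER_DAY


# Epoch times reported above for the CPU-based trainer.
for label, epoch_s in [("fast CPU node", 700), ("slow CPU node", 3000)]:
    print(f"{label}: {training_days(epoch_s):.1f} days")
```

At 3000 s per epoch this comes to just under 35 days, which is where the "beyond a month" estimate comes from.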

To address this issue, I developed a new GPU-based augmentation trainer. In my tests, this trainer achieved a 3× reduction in epoch time compared to the CPU-based version (from ~700s down to ~230s on my cluster).
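To illustrate the general idea (this is not the code of the trainer itself), here is a minimal sketch of batch-level augmentation done directly on the device with plain PyTorch tensor operations. The function name `gpu_augment`, the choice of transforms, and all parameter values are assumptions made for this example.

```python
import torch


def gpu_augment(images, labels, p_flip=0.5, noise_std=0.1):
    """Augment a batch on whatever device the tensors already live on.

    images: (B, C, D, H, W) float tensor; labels: (B, 1, D, H, W) int tensor.
    Hypothetical helper for illustration only.
    """
    # Random flip along each spatial axis. For simplicity the same flip is
    # applied to the whole batch; image and label are always flipped together.
    for axis in (2, 3, 4):
        if torch.rand(1).item() < p_flip:
            images = images.flip(axis)
            labels = labels.flip(axis)

    # Per-sample multiplicative brightness, broadcast over C, D, H, W.
    batch_size = images.shape[0]
    scale = torch.empty(batch_size, 1, 1, 1, 1, device=images.device).uniform_(0.75, 1.25)
    images = images * scale

    # Additive Gaussian noise, generated on the same device as the images.
    images = images + noise_std * torch.randn_like(images)
    return images, labels


# Falls back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
images = torch.randn(2, 1, 16, 32, 32, device=device)
labels = torch.randint(0, 3, (2, 1, 16, 32, 32), device=device)
aug_images, aug_labels = gpu_augment(images, labels)
```

Because every operation acts on tensors that are already on the GPU, no CPU worker processes are involved; the augmentation cost moves into the training step instead of the data loader.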

At the moment, this trainer doesn’t yet include as many augmentation options as nnUNetTrainerDAExt, but I am actively working on expanding it, either by implementing new transforms or by integrating existing ones from libraries such as Kornia or torchvision. Admittedly, ready-made 3D transforms are harder to find than 2D ones.
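One building block that does work in 3D out of the box is PyTorch's `affine_grid` / `grid_sample` pair, which accepts 5D `(B, C, D, H, W)` inputs. The following is a sketch of a random in-plane rotation built on it, not a transform taken from the trainer; `random_rotate_3d` and its default angle are hypothetical. It assumes label 0 is background (used for padding) and that H equals W, since the normalized coordinates would otherwise turn the rotation into a slightly sheared one.

```python
import math

import torch
import torch.nn.functional as F


def random_rotate_3d(images, labels, max_angle_deg=15.0):
    """Rotate each sample about the depth axis by a random angle.

    images: (B, C, D, H, W) float; labels: (B, 1, D, H, W) int.
    Hypothetical helper; assumes background label 0 and H == W.
    """
    batch_size = images.shape[0]
    angles = (torch.rand(batch_size, device=images.device) * 2 - 1) * math.radians(max_angle_deg)
    cos, sin = torch.cos(angles), torch.sin(angles)

    # One 3x4 affine matrix per sample; grid coordinates are ordered (x, y, z),
    # so a rotation in the x-y plane leaves the depth axis untouched.
    theta = torch.zeros(batch_size, 3, 4, device=images.device, dtype=images.dtype)
    theta[:, 0, 0] = cos
    theta[:, 0, 1] = -sin
    theta[:, 1, 0] = sin
    theta[:, 1, 1] = cos
    theta[:, 2, 2] = 1.0

    grid = F.affine_grid(theta, list(images.shape), align_corners=False)
    # "bilinear" performs trilinear interpolation for 5D inputs.
    images_out = F.grid_sample(images, grid, mode="bilinear",
                               padding_mode="zeros", align_corners=False)
    # Nearest-neighbour sampling keeps label values discrete.
    labels_out = F.grid_sample(labels.to(images.dtype), grid, mode="nearest",
                               padding_mode="zeros", align_corners=False).to(labels.dtype)
    return images_out, labels_out


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
images = torch.randn(2, 1, 8, 16, 16, device=device)
labels = torch.randint(0, 3, (2, 1, 8, 16, 16), device=device)
rot_images, rot_labels = random_rotate_3d(images, labels)
```

The same pattern extends to scaling or translation by changing the entries of `theta`, which may cover some of the spatial transforms that are otherwise hard to find for volumes.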

If anyone is interested in contributing to the development of this trainer, your help would be greatly appreciated!
