Issue with training and inference #1

@SkAdilina

Hi,

First of all, thank you so much for uploading the longitudinal pipeline of nnUNet.

I trained my dataset (for all the folds) with the default trainer, using the command:

  • "LongiSeg_train 900 3d_fullres 0 --npz".
  • However, it seems to have trained nnUNetTrainerLongi, because the folder in the results directory is named "nnUNetTrainerLongi__nnUNetPlans__3d_fullres".
  • Should the default trainer be LongiSegTrainer instead?

When I try to run inference with the default command:

  • LongiSeg_predict -i MY_INPUT_FOLDER -o MY_OUTPUT_FOLDER -pat PATH_TO_patients.json -d 900 -f 0
  • I get ERROR message --> FileNotFoundError: [Errno 2] No such file or directory: 'HIDING_ABSOLUTE_PATH/LongiSeg_results/Dataset900_FSNifti/LongiSegTrainer__nnUNetPlans__3d_fullres/dataset.json'
  • (It seems to be looking for the "LongiSegTrainer__nnUNetPlans__3d_fullres" folder, which I don't have.)
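
For reference, the two folder names quoted above follow a "<trainer>__<plans>__<configuration>" pattern. The sketch below shows one way to list which trainers actually have a results folder before running prediction. `model_folder` and `available_trainers` are hypothetical helpers written for illustration, not part of LongiSeg, and the default plans and configuration names are simply taken from the folder name in this report.

```python
from pathlib import Path


def model_folder(results_root, dataset, trainer,
                 plans="nnUNetPlans", configuration="3d_fullres"):
    """Build the expected results folder (hypothetical helper).

    Assumes the "<trainer>__<plans>__<configuration>" naming seen in
    the folder names quoted in this report.
    """
    return Path(results_root) / dataset / f"{trainer}__{plans}__{configuration}"


def available_trainers(results_root, dataset):
    """Return the trainer names that have a results folder on disk."""
    dataset_dir = Path(results_root) / dataset
    if not dataset_dir.is_dir():
        return []
    return sorted(
        p.name.split("__")[0]
        for p in dataset_dir.iterdir()
        if p.is_dir() and p.name.count("__") == 2
    )
```

With only the nnUNetTrainerLongi folder on disk, `available_trainers(...)` would list just that trainer, and `model_folder(..., "LongiSegTrainer")` would point at a directory that does not exist, which matches the FileNotFoundError.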

When I specify the trainer for inference:

  • LongiSeg_predict -tr nnUNetTrainerLongi -i MY_INPUT_FOLDER -o MY_OUTPUT_FOLDER -pat PATH_TO_patients.json -d 900 -f 0
  • I get ERROR message -->
    <class 'longiseg.training.LongiSegTrainer.variants.longitudinal.nnUNetTrainerLongi.nnUNetTrainerLongi'>
    Traceback (most recent call last):
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/longit-nnunet-venv/bin/LongiSeg_predict", line 8, in
    sys.exit(predict_longi_entry_point())
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/inference/predict_from_raw_data_longi.py", line 410, in predict_longi_entry_point
    predictor.initialize_from_trained_model_folder(
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/inference/predict_from_raw_data_longi.py", line 64, in initialize_from_trained_model_folder
    trainer_class.architecture_class_name,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AttributeError: type object 'nnUNetTrainerLongi' has no attribute 'architecture_class_name'
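
The AttributeError above says that the class object itself has no `architecture_class_name`. A small sketch of how such a requirement could be checked up front, with a more descriptive message, is given below. `TrainerWithArch` and `TrainerWithoutArch` are stand-in classes invented for illustration (they are not LongiSeg classes), and `require_class_attr` is a hypothetical helper.

```python
def require_class_attr(cls, name):
    """Return a class-level attribute or raise a descriptive error."""
    missing = object()  # sentinel, so that None remains a valid value
    value = getattr(cls, name, missing)
    if value is missing:
        raise AttributeError(
            f"{cls.__name__} does not define '{name}'; "
            "it may not be usable where this attribute is required"
        )
    return value


class TrainerWithArch:
    # Placeholder value, not a real architecture name
    architecture_class_name = "SomeArchitecture"


class TrainerWithoutArch:
    pass
```

Calling `require_class_attr(TrainerWithoutArch, "architecture_class_name")` raises an AttributeError that names both the class and the missing attribute.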

Also, when I tried to specify the trainer for training:

  • LongiSeg_train 900 3d_fullres -tr LongiSegTrainer 0 --npz
  • I get ERROR message -->
    2025-04-07 14:58:00.617861: Using torch.compile...
    2025-04-07 14:58:05.606939: do_dummy_2d_data_aug: False
    2025-04-07 14:58:05.730882: Using splits from existing split file: HIDING_ABSOLUTE_PATH/LongiSeg_preprocessed/Dataset900_FSNifti/splits_final.json
    2025-04-07 14:58:05.793840: The split file contains 5 splits.
    2025-04-07 14:58:05.794511: Desired fold for training: 0
    2025-04-07 14:58:05.795130: This split has 2157 training and 803 validation cases.
    Traceback (most recent call last):
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/longit-nnunet-venv/lib/python3.12/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 53, in producer
    item = next(data_loader)
    ^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/longit-nnunet-venv/lib/python3.12/site-packages/batchgenerators/dataloading/data_loader.py", line 126, in next
    return self.generate_train_batch()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/training/dataloading/longi_data_loader.py", line 203, in generate_train_batch
    data_prior_all[j] = np.pad(data_prior, padding, 'constant', constant_values=0)
    ~~~~~~~~~~~~~~^^^
    ValueError: could not broadcast input array from shape (1,191,256,219) into shape (1,191,257,219)
    Exception in background worker 5:
    could not broadcast input array from shape (1,191,256,219) into shape (1,191,257,219)
    using pin_memory on device 0
    Traceback (most recent call last):
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/longit-nnunet-venv/bin/LongiSeg_train", line 8, in
    sys.exit(run_training_longi_entry())
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/run/run_training.py", line 325, in run_training_longi_entry
    run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/run/run_training.py", line 207, in run_training
    nnunet_trainer.run_training()
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/training/LongiSegTrainer/nnUNetTrainer.py", line 1330, in run_training
    self.on_train_start()
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/training/LongiSegTrainer/LongiSegTrainer.py", line 485, in on_train_start
    super().on_train_start()
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/training/LongiSegTrainer/nnUNetTrainer.py", line 867, in on_train_start
    self.dataloader_train, self.dataloader_val = self.get_dataloaders()
    ^^^^^^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/LongiSeg/longiseg/training/LongiSegTrainer/LongiSegTrainer.py", line 315, in get_dataloaders
    _ = next(mt_gen_train)
    ^^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/longit-nnunet-venv/lib/python3.12/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 196, in next
    item = self.__get_next_item()
    ^^^^^^^^^^^^^^^^^^^^^^
    File "HIDING_ABSOLUTE_PATH/nnunet-longit/longit-nnunet-venv/lib/python3.12/site-packages/batchgenerators/dataloading/nondet_multi_threaded_augmenter.py", line 181, in __get_next_item
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
    RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
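
The ValueError in the traceback is NumPy's message for assigning an array into a slot of a different shape. The sketch below reproduces that class of error with small stand-in shapes and shows one way to pad an array to an exact target shape. It is only an illustration under assumptions, not the code in longi_data_loader.py, and `pad_to_shape` is a hypothetical helper.

```python
import numpy as np


def pad_to_shape(arr, target_shape):
    """Zero-pad arr at the end of each axis so it matches target_shape."""
    if arr.ndim != len(target_shape):
        raise ValueError("arr and target_shape must have the same number of axes")
    if any(t < s for s, t in zip(arr.shape, target_shape)):
        raise ValueError(f"cannot pad {arr.shape} down to {tuple(target_shape)}")
    padding = [(0, t - s) for s, t in zip(arr.shape, target_shape)]
    return np.pad(arr, padding, "constant", constant_values=0)


# Small stand-in shapes: the buffer slot is one voxel longer along one axis,
# mirroring (1,191,256,219) vs (1,191,257,219) in the traceback.
buffer = np.zeros((2, 1, 4, 5, 3), dtype=np.float32)
prior = np.ones((1, 4, 4, 3), dtype=np.float32)

try:
    buffer[0] = prior  # shapes differ by one along axis 2, so this fails
except ValueError as err:
    message = str(err)

# Padding to the slot's exact shape makes the assignment valid.
buffer[0] = pad_to_shape(prior, buffer.shape[1:])
```

Here the padding widths are derived from the destination slot itself, so the padded array and the slot cannot disagree by a voxel.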

Could you please help me figure out what I am doing wrong in training and inference?
