RuntimeError when multiplying tensors a and b at non-singleton dimension 4. Error caused while calculating loss.

  By: shirshak.acharya on Aug. 14, 2024, 4:25 p.m.

While trying to train the FracSegNet model from GitHub (https://github.com/YzzLiu/FracSegNet/issues/6), I get the following error message:

epoch:  0
Traceback (most recent call last):
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/bin/nnUNet_train", line 8, in <module>
    sys.exit(main())
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/run/run_training.py", line 164, in main
    trainer.run_training()
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/network_training/nnUNetTrainerV2.py", line 431, in run_training
    ret = super().run_training()
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/network_training/nnUNetTrainer.py", line 306, in run_training
    super(nnUNetTrainer, self).run_training()
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/network_training/network_trainer.py", line 447, in run_training
    l = self.run_iteration(self.tr_gen, True)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/network_training/nnUNetTrainerV2.py", line 237, in run_iteration
    l = self.loss(output, target, disMap, self.epoch)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/loss_functions/deep_supervision.py", line 29, in forward
    l = weights[0] * self.loss(x[0], y[0],disMap[0],epoch)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/loss_functions/dice_loss.py", line 373, in forward
    dc_loss = self.dc(net_output, target, disMap_weight, loss_mask=mask, current_epoch = epoch) if self.weight_dice != 0 else 0
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/loss_functions/dice_loss.py", line 198, in forward
    tp, fp, fn, _ = get_tp_fp_fn_tn(x, y,disMap,axes, loss_mask, False,current_epoch)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/nnunet/training/loss_functions/dice_loss.py", line 150, in get_tp_fp_fn_tn
    tp = torch.mul(tp, disMap2onehot)
RuntimeError: The size of tensor a (112) must match the size of tensor b (221) at non-singleton dimension 4
Exception in thread Thread-4 (results_loop):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 92, in results_loop
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the print"
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
Exception in thread Thread-5 (results_loop):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/Enterprise2/shirshak/NewFracSegNet/FracSegNet/.venv/lib/python3.10/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 92, in results_loop
    raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the print"
RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
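
The failing call is `torch.mul(tp, disMap2onehot)`, and the message says the two tensors disagree at dimension 4 (112 versus 221). Below is a minimal sketch of a shape check that might help narrow this down. It is plain Python and only mimics the usual broadcasting rule (sizes must be equal, or one of them must be 1). `broadcast_mismatches` is a hypothetical helper, not part of nnU-Net or FracSegNet.

```python
from typing import List, Sequence, Tuple


def broadcast_mismatches(
    shape_a: Sequence[int], shape_b: Sequence[int]
) -> List[Tuple[int, int, int]]:
    """List every (dim, size_a, size_b) that cannot be broadcast together.

    Hypothetical debugging helper: two sizes are compatible when they are
    equal or when one of them is 1.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    return [
        (dim, size_a, size_b)
        for dim, (size_a, size_b) in enumerate(zip(a, b))
        if size_a != size_b and size_a != 1 and size_b != 1
    ]
```

With made-up 5D shapes (only the 112 and 221 come from the traceback), `broadcast_mismatches((2, 2, 112, 112, 112), (2, 2, 112, 112, 221))` flags dimension 4. In the real code, printing `broadcast_mismatches(tuple(tp.shape), tuple(disMap2onehot.shape))` just before the `torch.mul` call in `get_tp_fp_fn_tn` should show whether the distance map was cropped or resampled to a different size than the network output.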

I hope someone here can help. What might be causing this error? I need to train the FracSegNet model and make a submission soon.

Re: RuntimeError when multiplying tensors a and b at non-singleton dimension 4. Error caused while calculating loss.  

  By: shirshak.acharya on Aug. 14, 2024, 4:26 p.m.

The same error was discussed at https://github.com/YzzLiu/FracSegNet/issues/6, but the suggestions there did not solve the problem in my case. Can anyone help?