Panoptica Metrics SIGTERM issues
By: minanessiem on Aug. 14, 2024, 1:04 p.m.
Hello ISLES24 Team,
I have just started integrating your provided Panoptica-based script to add the official metrics to my training pipeline. Currently I cache all of my predictions and their associated labels, then pass them case by case to a metrics calculator that encapsulates the code snippets provided in the GitHub repo.
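For context, my caching step looks roughly like this. This is a minimal sketch: compute_case_metrics is a hypothetical stand-in for the repo's panoptica snippet (here just a plain Dice on binary masks), not the real panoptica API.

```python
import numpy as np

def compute_case_metrics(pred: np.ndarray, label: np.ndarray) -> dict:
    # Hypothetical stand-in for the panoptica-based calculator from the
    # repo; here just a plain Dice on binary masks for illustration.
    inter = np.logical_and(pred > 0, label > 0).sum()
    denom = (pred > 0).sum() + (label > 0).sum()
    return {"dice": 2.0 * inter / denom if denom else 1.0}

def on_epoch_end_metrics(cases):
    # cases: list of (case_id, pred, label) collected during validation.
    # Everything is moved to CPU numpy before this point, so no CUDA
    # tensors are involved in the metric computation.
    return {cid: compute_case_metrics(p, l) for cid, p, l in cases}
```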
However, I keep running into the following error:
[rank: 0] Received SIGTERM: 15
When I set verbose=True, I get the following log:
Panoptic: Start Evaluation
-- Got SemanticPair, will approximate instances
-- Got UnmatchedInstancePair, will match instances
(here it keeps hanging; when I press Ctrl+C, I see the error again)
^C[rank: 0] Received SIGTERM: 15
I have tried multiple options: calculating the metrics after each batch, which breaks the training pipeline mid-epoch; and caching the epoch results (or passing them through a multiprocessing Queue) to calculate them on_epoch_end, which only kicks the can down the road.
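One thing that at least made the hang visible instead of letting the trainer sit until SIGTERM: wrapping each per-case evaluation in an alarm-based timeout. This is my own helper, not part of panoptica, and it is POSIX-only and main-thread-only since it relies on SIGALRM; the commented usage (evaluator, sample, case_id) is hypothetical.

```python
import signal
from contextlib import contextmanager

@contextmanager
def time_limit(seconds: int):
    """Raise TimeoutError if the wrapped block runs longer than `seconds`.

    POSIX-only and must be called from the main thread, since it
    relies on SIGALRM.
    """
    def _handler(signum, frame):
        raise TimeoutError(f"metric call exceeded {seconds}s")

    old = signal.signal(signal.SIGALRM, _handler)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                    # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)

# Hypothetical usage around the panoptica snippet:
# try:
#     with time_limit(120):
#         result, _ = evaluator.evaluate(sample)
# except TimeoutError:
#     print(f"case {case_id} hung, skipping")
```

This does not fix the underlying hang, but it lets the epoch finish and tells me which cases stall in the instance-matching step.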
This issue occurs whether I run with workers=0 or workers > 1.
Is there maybe something I don't understand about using the panoptica library?
Thank you in advance.