Can you provide the test set of the qualification phase after the competition finishes?
By: 王呆鹅 on March 30, 2022, 12:55 p.m.
We have run into a problem. We randomly split the training set into 3 parts: 1200 images for training, 400 for validation and 400 for testing. We trained our model under different hyperparameters, saved the checkpoint that reached the best validation AUC, and then used that best model to predict the test split and compute the test AUC. The results are shown here:
trial 1: auc_val = 0.8178, auc_test = 0.8179
trial 2: auc_val = 0.8026, auc_test = 0.8144
trial 3: auc_val = 0.8189, auc_test = 0.8105
trial 4: auc_val = 0.8084, auc_test = 0.8079
trial 5: auc_val = 0.8118, auc_test = 0.8193
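For reference, the split and evaluation protocol above is essentially the following (a minimal sketch assuming scikit-learn; the image IDs, labels and random seed are placeholders, not our actual pipeline):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholders standing in for our 2000 labelled training images
# (in practice: file paths and binary targets).
rng = np.random.default_rng(0)
images = np.arange(2000)                # image indices / file IDs
labels = rng.integers(0, 2, size=2000)  # binary targets

# 1200 train / 800 held out, then split the 800 into 400 val / 400 test.
X_train, X_hold, y_train, y_hold = train_test_split(
    images, labels, test_size=800, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_hold, y_hold, test_size=400, random_state=42)
```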
After that, we ran 5-fold cross-validation, with 1600 images for training and 400 for validation in each fold. The validation AUCs of the 5 folds were 0.7913, 0.7819, 0.8306, 0.8284 and 0.8689, respectively; the average was 0.8202.
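Schematically, the fold generation looks like this (again a sketch with placeholder data; KFold and the seed are illustrative):

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=2000)  # placeholder binary targets

# 5 folds: 1600 training / 400 validation images each.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(labels)):
    print(f"fold {fold}: {len(train_idx)} train, {len(val_idx)} val")
    # Train a model on train_idx and keep the checkpoint with the best
    # validation AUC on val_idx, as in the single-split trials above.
```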
When we submitted the algorithm, we simply averaged the five folds' predictions, but the AUC on the test set was only 0.75.
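The ensembling step is nothing more exotic than the following (a sketch; the prediction arrays and labels are random placeholders, since the true test labels are held by the organizers):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Placeholder probabilities of the 5 fold models on the same test images;
# in the real submission these come from the 5 best checkpoints.
rng = np.random.default_rng(0)
fold_preds = rng.random((5, 400))        # shape: (n_models, n_test_images)
y_true = rng.integers(0, 2, size=400)    # stand-in for the hidden labels

ensemble_pred = fold_preds.mean(axis=0)  # simple average over the 5 models
print(roc_auc_score(y_true, ensemble_pred))
```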
According to the previous five trials, the validation AUC and the test AUC were very close, and a model ensemble usually performs better than a single model, so we do not understand what happened. Does the test set have the same distribution as the training set? Are the age, gender and spacing fields correct in the test set? Could you provide the test set of the qualification phase after the competition finishes, so that we can study what happened there?