Question about final leaderboard

Hello @team!
Could you please explain how the final leaderboard will be constructed? Is it necessary to submit your solution notebook to appear on the final leaderboard?

Hi @estrievich99

To be ranked on the final leaderboard, you will need to submit your final submission code. Final submission scoring will be based primarily on your model’s results on a hold-out dataset, as well as on the interpretability of your submitted Jupyter Notebook.

Onward Team

Hi @team, another question:

  1. Is it OK if it’s a Python script? Why does it have to be a Jupyter Notebook?
  2. How will the hold-out dataset be scaled: the same as the public test data or the same as the training data?

What do you mean by a hold-out dataset? Is it the part of those 15 cubes used for inference?

@dmitry.ulyanov.msu, per the Rules on the Challenge page, solutions are required in a Jupyter Notebook. You may, of course, provide .py scripts with your model to accompany the Jupyter Notebook. As a reminder, 5% of the overall score pertains to interpretation, which includes documentation and markdown discussion. This part is critical, as it helps our team reproduce and review your model and results.

@estrievich99 The Onward team will use an additional hold-out dataset that is not released to Challengers in order to review model validity and performance. We do this for every challenge. The hold-out data will be scaled the same as the test data.

Onward Team

Hi @team ,

Just a bit of ranting – I just submitted my solution, but while preparing it, I realized that it is not very transparent what “5% of the overall score pertains to interpretation” actually means.

Depending on interpretation, it can mean very different things. For example, if it means result = 95% * metric + 5% * metric * (documentation score between 0 and 1), then the final ranking depends very heavily on that 5%, because the difference between 1st place and 3rd place is approximately 5% of the metric. In other competitions, 5% could be the difference between 1st place and 10th place. At the same time, improving the metric by 5% is usually very hard; it cannot be compared to “writing better docs”. And if all that effort can be wiped out simply because I, for example, did not comply with “reasonably following standard Python style guidelines”, that is very sad. Knowing that documentation could affect the standings, I spent a lot of time on it, and it will be ten times sadder if it is still not what you expect.
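To make the ambiguity concrete, here is a minimal sketch of the hypothetical weighting I described above. The formula and the numbers are my own assumptions, not anything confirmed by the organizers:

```python
def final_score(metric: float, doc_score: float) -> float:
    """Hypothetical blend: 95% raw model metric plus 5% of the metric
    scaled by a documentation score in [0, 1]. This is an assumed
    interpretation of the rules, not the published formula."""
    return 0.95 * metric + 0.05 * metric * doc_score

# Competitor A leads on the raw metric but has mediocre documentation;
# Competitor B trails by about 2.5% on the metric but has perfect docs.
a = final_score(metric=0.80, doc_score=0.4)   # 0.776
b = final_score(metric=0.78, doc_score=1.0)   # 0.780 -> B overtakes A
print(f"A: {a:.3f}  B: {b:.3f}")
```

Under this reading, the documentation score alone can swap nearby positions, which is exactly why I think the criterion should be spelled out precisely.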

Overall, my sincere suggestion for future competitions is to make the criteria and other details as formal as possible, because competitors really want to understand them. For example, Kaggle makes it completely clear how submissions are scored, when the competition ends, and so on.
