More suitable evaluation metric

As we can see in the starter notebook, the level-1 part of the problem is not really a challenge, since the building stock type seems easily predictable. So the real challenge is at level 2. Correct me if I'm wrong, but at level 2 there are 12 targets for which F1 doesn't really make sense; treating those as regression problems would reflect the true performance of our solutions much better.

I'm talking about the following targets (a small illustration of the issue follows the list):
‘in.number_of_stories_com’,
‘in.vintage_com’,
‘in.weekday_opening_time…hr_com’,
‘in.weekday_operating_hours…hr_com’,
‘in.bedrooms_res’,
‘in.cooling_setpoint_res’,
‘in.heating_setpoint_res’,
‘in.geometry_floor_area_res’,
‘in.income_res’,
‘in.vintage_res’,
‘in.tstat_clg_sp_f…f_com’,
‘in.tstat_htg_sp_f…f_com’
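
As a minimal sketch with my own toy numbers (not challenge data), here is why macro F1 on an ordinal target such as the number of stories cannot tell a near-miss from a wild miss, while a regression metric like MAE can:

```python
# Toy illustration: macro F1 gives zero credit to an off-by-one prediction,
# exactly as it does to a prediction that is off by ten, whereas MAE
# distinguishes the two. Values below are made up for illustration only.
import numpy as np
from sklearn.metrics import f1_score, mean_absolute_error

y_true  = np.array([1, 2, 3, 4, 5, 10, 12, 14])   # true number of stories
y_close = np.array([2, 1, 4, 3, 6,  9, 13, 15])   # every prediction off by one
y_far   = np.array([14, 12, 10, 12, 14, 1, 2, 3]) # wildly wrong predictions

for name, y_pred in [("off-by-one", y_close), ("far-off", y_far)]:
    print(
        f"{name:10s}  macro F1 = {f1_score(y_true, y_pred, average='macro'):.3f}"
        f"   MAE = {mean_absolute_error(y_true, y_pred):.2f}"
    )
# Both prediction sets score macro F1 = 0.0, but MAE separates them clearly.
```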


Hi @clkrv1,

Firstly, accurately classifying building stock types is crucial, as it forms the foundation for the subsequent predictions. If the initial classification is incorrect, all of the subsequent level-2 predictions for that building will be wrong and are discarded. Therefore, achieving accurate level-1 predictions is heavily weighted.
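
To illustrate the gating idea, here is a small sketch of that logic (not the official scorer; the column names `stock_type_true`, `stock_type_pred`, and the `_true`/`_pred` suffixes are hypothetical). Level-2 predictions only receive credit on rows where the level-1 stock type was predicted correctly:

```python
# Sketch of gated level-2 scoring: a wrong level-1 prediction zeroes out
# all level-2 credit for that building. Column naming is assumed.
import pandas as pd

def gated_level2_accuracy(df: pd.DataFrame, level2_cols: list[str]) -> float:
    """df holds '<col>_true' / '<col>_pred' pairs for each level-2 target,
    plus 'stock_type_true' / 'stock_type_pred' for the level-1 label."""
    level1_correct = df["stock_type_true"] == df["stock_type_pred"]
    hits = 0
    total = len(df) * len(level2_cols)
    for col in level2_cols:
        # Rows with a wrong level-1 prediction contribute zero credit.
        hits += ((df[f"{col}_true"] == df[f"{col}_pred"]) & level1_correct).sum()
    return hits / total
```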

Secondly, if selected among the top scorers on the leaderboard, your classification model will be evaluated on a holdout dataset which may present greater level-1 classification challenges compared to the training dataset.

Regarding your second point about regression, we chose to focus on classification for several reasons, such as the limited set of discrete values that the target variables can assume. However, you have the flexibility to employ any method, including regression, to address the classification problem outlined in the challenge.
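
For example, one simple way to use regression inside the classification framing (a sketch, assuming the discrete labels are numeric) is to fit a regressor and then snap its continuous output to the nearest value seen in training, so the final submission is still a valid class label:

```python
# Regression-then-snap sketch: predict a continuous value for an ordinal
# target, then map it to the closest admissible (observed) class value.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_and_snap(X_train, y_train, X_test):
    """y_train holds discrete numeric labels, e.g. number of stories."""
    valid_values = np.sort(np.unique(y_train))
    reg = GradientBoostingRegressor().fit(X_train, y_train)
    raw = reg.predict(X_test)
    # Map each continuous prediction to the nearest valid class value.
    idx = np.abs(raw[:, None] - valid_values[None, :]).argmin(axis=1)
    return valid_values[idx]
```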

Best of luck Classifying the Buildings!
ThinkOnward Team