Numlabs Data Science Blog - Soft grade detector - analyzing bias in climbers' grading system

Introduction

Every sport has a way to compare athletes’ skills. Runners measure their time over a lap or a marathon; cyclists use social apps such as Strava. Climbers have developed their own grading systems.

0-6 grading system

Older scales were used when people climbed only in the mountains, for exploration rather than as an athletic sport.

Modern grading system

Despite the chart above, it is still unclear whether one route rated 7b is as hard as another 7b. No objective value, such as time or distance, expresses a route’s grade; it rests on a climber’s subjective experience compared with other routes. That experience depends on many factors, such as height or weight. That’s why the soft grade, meaning a route that should have been graded lower than it actually is, is a hot issue in the climbing community.

We therefore came up with the idea of exploring a dataset, available thanks to Kaggle, to look for routes that carry one grade, say 7b, but should really be graded differently, say 7c+ or 7a.

The Dataset

Kaggle hosts data from a web service containing 4.1 million ascent logs posted by climbers. It provides many columns, but our analysis needs only a few of them.

name	crag_id	grade_id	user_id	fra_routes
Maßarbeit	16600	55	1476	7b+
Zugabe	16600	55	1476	7b+
Fingerbeißer	16600	53	1476	7b
Ich Habs Wollen Wis	16600	55	1476	7b+
Halucinacna	0	55	1476	7b+
Nove koreni	27784	55	1476	7b+
Spanelske lzi	27784	53	1476	7b
Deprese	29360	55	1476	7b+
Gottes Vergessene Kinder	21971	49	1476	7a

data needed

name - the name the route’s author gave it after the first ascent

crag_id - id of the crag where the route is located

user_id - id of the person who climbed the route

A major problem with the dataset is the lack of a route_id column. As a result, the pair (name, crag_id) is the best approximation of a candidate key for a route. It is not perfect: a crag may contain two different routes with the same name. What’s more, people name the same route in different conventions. Some give the name of a route’s variation or simply make a typo, and the system treats the result as a different route. We decided to use the pair (name, crag_id) anyway, because it works for the majority of cases.

Seeking for a soft grade

We propose two different ways to find a route that has a high grade but is easy to climb in practice.

User’s local maximum

We look at a user’s climbing progression over time and at the peaks in that chart. We focus on local maxima because a peak could be a route, for instance an 8c, that a climber completed while otherwise being able to do only significantly easier routes.

The chart contains the logs of one of the strongest climbers

Routes mean and dominant grade comparison

The second way to look for soft grades is to analyse all the grades people have proposed for a route. Climbers tend to propose the same difficulty as the first ascensionist gave, or simply the one in the guidebook. But when a route is much harder or easier than its grade, they break ranks and propose a more accurate grade. Comparing the dominant (most frequent) grade of a route with its mean grade would reveal this scenario.
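The comparison can be sketched in a few lines. This is only an illustration under our own assumptions: the function name, the sample proposals and the difference value of 0.5 are hypothetical, and we assume grades are compared as numeric grade_id values (53 = 7b and 55 = 7b+ in the table above).

```python
from statistics import mean, mode


def is_soft_by_central_tendency(grade_ids, difference=0.5):
    """Flag a route whose mean proposed grade sits below its dominant grade.

    `difference` is a hypothetical cut-off expressed in grade_id units.
    """
    dominant = mode(grade_ids)   # most frequently proposed grade
    average = mean(grade_ids)    # pulled down when many climbers downgrade
    return (dominant - average) >= difference


# Hypothetical proposals for two routes, both dominated by grade_id 55.
agreed = [55, 55, 55, 55, 53, 57]
disputed = [55, 55, 55, 55, 53, 53, 51, 51, 49]

print(is_soft_by_central_tendency(agreed))
print(is_soft_by_central_tendency(disputed))
```

On the first route the upgrades and downgrades cancel out, so mean and dominant agree; on the second, the downgrades pull the mean well below the dominant grade.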

Preprocessing

The dataset needs to be cleaned up before the analysis can give useful results. We take the subset of routes whose name occurs more than n times. This is crucial because the central_tendency algorithm calculates a mean and a dominant, which are meaningless on too small a sample. Then, for the local_maximum function, we drop all users who have posted fewer than m ascents, which is essential for properly analysing the peaks in a climber’s progression. Experimentally we chose a minimum of 500 logs per route and a minimum of 30 routes logged per user. Lastly, we delete the many non-existent routes with names such as “don’t know the name” or “unnamed”.
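A minimal pandas sketch of these three filters might look as follows. The column names come from the table above and the default thresholds from the text; the function name, the placeholder set and the order of the counts are our assumptions, not the original implementation.

```python
import pandas as pd

# Hypothetical set of placeholder names; the text gives only two examples.
PLACEHOLDER_NAMES = {"don't know the name", "unnamed"}


def preprocess(ascents, min_route_logs=500, min_user_logs=30):
    """Apply the route, user and placeholder-name filters in that order."""
    # 1. Keep routes, identified by (name, crag_id), with enough logs.
    route_logs = ascents.groupby(["name", "crag_id"])["user_id"].transform("count")
    ascents = ascents[route_logs >= min_route_logs]

    # 2. Keep users who logged enough ascents.
    user_logs = ascents.groupby("user_id")["name"].transform("count")
    ascents = ascents[user_logs >= min_user_logs]

    # 3. Drop entries whose name is only a placeholder.
    names = ascents["name"].str.strip().str.lower()
    return ascents[~names.isin(PLACEHOLDER_NAMES)]


# Tiny hypothetical sample with lowered thresholds.
sample = pd.DataFrame({
    "name": ["A", "A", "A", "B", "unnamed", "unnamed", "unnamed"],
    "crag_id": [1, 1, 1, 1, 2, 2, 2],
    "user_id": [10, 10, 11, 10, 10, 10, 10],
})
cleaned = preprocess(sample, min_route_logs=3, min_user_logs=2)
print(cleaned)
```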

Analysis

After filtering, we were left with 10415 posts on 832 routes in a data frame. For the local maximum finder we use the find_peaks function from scipy. It takes the parameters height, threshold, distance, prominence, width, wlen, rel_height and plateau_size, which change how sensitive the algorithm is when finding peaks. We don’t want noise to be reported as a peak, but no significant peak should be missed either.

The most significant arguments are threshold, the vertical distance between a peak and its neighbouring samples, and prominence, best described by its similarity to the prominence of a mountain peak.

Central tendency has one argument, difference: the gap between the mean value and the dominant. Climbers tend not to change the original grade even if the route is slightly easier than graded. That’s why the difference argument is set very low.

The local maximum arguments are set by a grid search, so that the confusion matrix of predicted (local maximum) against actual (central tendency) values fits as closely as possible.

In our model, results from the central tendency algorithm are labeled Actual Values and results from local maximum are labeled Predicted.

After the experiments we chose the local_maximum configuration with the greatest accuracy and precision. Why are these two metrics the most significant? Accuracy describes how often the prediction is correct overall. We focused on precision because we want to minimize the number of routes predicted as soft_grade that are in fact ordinary ones. The arguments are set to threshold = 6 and prominence = 6. We rejected threshold = 9 because the system finds too few true positives with it.
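To illustrate how these two arguments behave, here is a small sketch that runs scipy’s find_peaks on a made-up progression. We assume the signal is a climber’s sequence of grade_id values in chronological order; the wrapper name and the sample data are hypothetical.

```python
import numpy as np
from scipy.signal import find_peaks


def local_maxima(grade_ids, threshold=6, prominence=6):
    """Return indices of ascents that stand out from their neighbours."""
    signal = np.asarray(grade_ids, dtype=float)
    peaks, _ = find_peaks(signal, threshold=threshold, prominence=prominence)
    return peaks.tolist()


# Hypothetical progression: one ascent (62) far above the neighbouring ones.
history = [49, 51, 49, 53, 62, 53, 51, 55, 53]
print(local_maxima(history))  # [4]
```

The small bumps at 51 and 55 are local maxima too, but they rise less than 6 above their neighbours, so threshold filters them out as noise.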
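The search itself can be sketched as a loop over argument pairs. Everything here is an assumption made for illustration: the `predict` callable stands in for the local maximum finder, routes are compared as sets, and ranking by accuracy first and precision second is our guess at how the two metrics are combined.

```python
from itertools import product


def grid_search(predict, actual_soft, all_routes, thresholds, prominences):
    """Return the best (accuracy, precision, threshold, prominence) tuple.

    `predict(threshold, prominence)` is a hypothetical callable returning
    the set of routes flagged as soft by the local maximum finder.
    """
    best = None
    for threshold, prominence in product(thresholds, prominences):
        predicted = predict(threshold, prominence)
        tp = len(predicted & actual_soft)
        fp = len(predicted - actual_soft)
        fn = len(actual_soft - predicted)
        tn = len(all_routes) - tp - fp - fn
        accuracy = (tp + tn) / len(all_routes)
        precision = tp / (tp + fp) if tp + fp else 0.0
        candidate = (accuracy, precision, threshold, prominence)
        if best is None or candidate > best:
            best = candidate
    return best


# Toy stand-in: a stricter threshold flags fewer routes.
def toy_predict(threshold, prominence):
    return {"a"} if threshold >= 6 else {"a", "b", "c"}


routes = {"a", "b", "c", "d"}
best = grid_search(toy_predict, {"a"}, routes, thresholds=[3, 6], prominences=[6])
print(best)
```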

Performance

With threshold and prominence set to 6 we get the following confusion matrix:

	Real Positive	Real Negative
Predicted Positive	63	136
Predicted Negative	77	556

accuracy: 74%

precision: 32%

recall: 45%

f1: 37%
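These four figures follow directly from the confusion matrix. A short check, using only the counts from the table above (the variable names are ours):

```python
# Counts taken from the confusion matrix above.
tp, fp = 63, 136   # predicted positive: real positive, real negative
fn, tn = 77, 556   # predicted negative: real positive, real negative

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"routes: {total}")
print(f"accuracy: {accuracy:.0%}, precision: {precision:.0%}")
print(f"recall: {recall:.0%}, f1: {f1:.0%}")
```

The counts add up to the 832 routes left after filtering.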

The confusion matrix shows how well the two algorithms agree on which routes are easy for their grade. Because most routes are graded properly, the share of true negatives is large.

Conclusion

Accuracy is fairly high, so why is precision so low? First of all, something we have not mentioned yet: climbers often pick routes as projects. They choose one really hard route and try it numerous times. When they finally complete it, it shows up as a peak in the local_maximum finder and cannot be distinguished from a soft-graded route. This is especially hard to account for, because recognising projects requires domain knowledge of climbers’ habits and is nearly impossible to deduce from the data.

