Athlete Selection Research

Author

Talon Hird & Ethan Egert

Published

October 27, 2024

Athlete Selection

Findings

Selection: Collie

  • Collie demonstrates overall superior fitness and running performance

  • Boasts a significantly faster average pace over Hound

  • Collie trains at far higher intensity on average

  • Collie shows statistical significance of improvement in aerobic fitness over time

  • Collie is a more efficient runner and is less adverseley affected by inclines while also showing efficiency gains over longer distances

Abstract

In this document, we will be performing an investigation into who is the superior candidate out of the two that have made it into the final round of the selection process. The winner will be chosen to be trained and prepared for competition at next year’s Masters Cross-Country World Championships. The criteria for the selection can be best described as ‘fitness’, however a term this vague is inherently difficult to define. In our estimation, ‘fitness’ comprises of a runner’s speed, cardiovascular fitness, and ability to maintain high performance when confronted with difficult running conditions.

Tests

First, we will conduct a simple test to see which runner typically runs at a faster pace. This will provide us with a starting point and context for further analysis.

Runner Avg_Pace
Collie 5.23
Hound 6.53

From this cursory glance at average pace distributions, we can see that Collie has more than a minute of advantage for average pace (min/km). This should be not be taken at face value as other factors such as going on longer runs can force a runner to pace themselves more conservatively. The next chart will clarify that Collie’s faster pace is not a symptom of much shorter runs.

The data shows us that while he has more outliers, Hound does not run a significantly longer distance on average than Collie. This coupled with Collie’s meaningful speed advantage give Collie a leg up at the current stage of analysis. However, to be sure the speed difference really is statistically significant we will perform a hypothesis test at 95% confidence before moving on to other criteria.

Our null hypothesis will be that there is no statistically significant difference between the paces of the two runners. Therefore, if the t-test can reject this at 5% level of significance, we will be able to assert that we are 95% confident that our assertation from the previous test that Collie is faster is correct.

The output of this confidence test will be displayed below this text in the form of the percentage chance that our null hypothesis is true given the properties of the data.

[1] "0.00000000000000000000000000000000000000000000001947701"

This exceedingly small number is far below our required p statistic of 5% and therefore we can say without hesitation that Collie is the faster runner on average. Since we now know this with statistical certainty, we may move on to criteria other than pace. We will now look into the statistical properties of the runners’ training intensities. If one of the runners trains at much higher intensity than the other, this could affect which one is perceived as faster without necessarily equating to better aerobic fitness. A higher aerobic TE score indicates that the athlete underwent a more intense workout and therefore will improve their fitness more from the session. The official Garmin website describes the 0-5 scale as such: “The training effect scale for both aerobic and anaerobic is: 0 – None, 1 – Minor, 2 – Maintaining, 3 – Improving, 4 – Highly Improving, 5 – Overreaching”.

Statistical Summary of Aerobic TE by Runner
Runner mean median stdev
Collie 4.25 4.3 0.55
Hound 3.56 3.5 0.72

It seems that the two athletes prefer to exercise at different intensities as Collie averages a significantly higher aerobic TE score than Hound. Interestingly, Collie runs at an approximately 25% faster pace than Hound and averages an 18% higher average aerobic TE. One interpretation could be that the runners have similar peak potential but Collie prefers to train at maximum capacity and faster pace far more often as shown by the volume of level 5 running sessions. This can be verified or disproven by looking into a measure of cardiovascular efficiency.

Taking a more scientific approach will help us to understand which runner’s cardiovascular system is more efficient and who has become a more effective runner over the course of their training. This measure of running efficiency is the most robust test of cardiovascular fitness that we have available given the data set. Running efficiency, as calculated here, is a ratio of distance covered per minute divided by the average heart rate. This gives us an indicator of how much energy (in terms of heart rate) a runner uses to cover a kilometer. The higher the running efficiency value, the less energy is required for the same distance, indicating better aerobic fitness. In the analysis, we can visualize how the running efficiency changes over time for each runner. The trend lines help show improvements in efficiency, with rising values indicating increased cardiovascular fitness. By analyzing efficiency across months, we get a clearer picture of which runner is improving at a faster rate while also gaining insight as to whether there may be seasonality present in the data. We have chosen to omit measures of metabolic efficiency and calorie consumption as they are not salient to running performance in this context.

Starting in 2018, Hound has remained fairly consistent with their running efficiency. This data reveals consistent effort and fitness levels. As time passes, the variability of Hound’s running efficiency starts to grow. Hound’s highest points in running efficiency display an improvement in fitness level. However, Hound’s lowest points get much lower over time. This increased variability may stem from a combination of factors, such as weather, elevation, or ascent. Overall, Hound’s average running efficiency remain relatively unchanged from 2018, to 2024.

Starting in 2020, Collie’s running efficiency has had a consistency that was comparable to Hounds. Over time, Collie’s performance shows a steady improvement with similar variability but a clear upward trend in efficiency. despite a temporary break from running that takes place between late 2021 and continuing into 2022, Collie’s running efficiency continues with an upward trend and consistent improvement. These findings reveal that, while Hound’s top end efficiency had improved, he also displayed a lack of consistency. In contrast, Collie demonstrates much greater consistency and steady improvement over time.

Multivariate Regression for Collie
term estimate std.error statistic p.value
(Intercept) 0.289534 0.358493 0.807640 0.420272
Date 0.000055 0.000019 2.944257 0.003627
Multivariate Regression for Hound
term estimate std.error statistic p.value
(Intercept) 0.989706 0.309019 3.202731 0.001509
Date 0.000005 0.000017 0.275332 0.783252

Reviewing Collies output, we see a slope of 0.000055. This small number is due to running efficiency being measured on a smaller scale compared to the number of days we have for these runners, since we are comparing trends between Collie and Hound, the unit size does not affect the analysis. 0.000055 indicates that Collie’s running efficiency improves by 0.000055 units per day, proving that this runner is making improvements. Collies data also shows a statistically significant p-value of 0.0036, equating to our slope calculation being significant.

Analyzing Hounds data, we notice that the slope is over 10 times less than Collies, showing that Hounds improvement over time is very limited. Also, a high p-value demonstrates that the observed trend in Hound’s efficiency is not statistically meaningful

The takeaways from this data is that Collie has a positive and significant trend upward, showing improvements in both pace and heart rate. In comparison, Hound shows no meaningful progress in heart rate or pace over the time frame of this data.

One key factor influencing variability in running efficiency is seasonality. Being in Canada, seasonal changes can have an impact on your physical performance. Multiple factors such as weather, food, daylight, and sleep patterns can have an impact on an athlete’s ability to perform consistently throughout the year.

The chart shows that both of these runners are impacted by seasonality. Efficiency tends to peak in the late summer, likely due to having more favorable running weather and longer daylight. In contrast, efficiency finds a low in January showing the challenging winter conditions.

This pattern highlights the importance of seasonality in understanding performance fluctuations. While both runners are impacted, Collie’s trend appears to be more consistent throughout different seasons. This consistency not only reflects physical strength but also mental. The ability to stay more consistent year round demonstrates resilience and determination.

There are a myriad of factors that could effect a runners ability to run at their typical pace or efficiency. It is important to study how adverse conditions affect each runner as if one runner has an achillies heel when it comes to a factor such as incline, heat, or very long distances, they would not be as suitable for high-intensity races that involve varied terrain, hot weather, or long endurance competitions. Out of those three factors, it appears that slope has a very strong effect on efficiency for both runners while the other two factors play less of a role.

In the case of each runner specifically, slope has a strong affect on efficiency. For every 1m/km increase in slope, Collie loses 0.006 points of efficiency, and Hound loses 0.013. This means that while both runners spend more energy going up hills as one would expect, Hound struggles significantly more than Collie does. Collie actually excels when it comes to longer runs showing a increased efficiency with distance. Hound on the other hand is fairly consistent when it comes to running efficiency over different distances. While the graph shows great efficiency gains for Hound’s marathon length runs, these are not numerous enough to be statistically significant. The table at the top show a multivariate regression of the factors and the graphs show the nonlinear and linear relationships between each of the factors against efficiency. Note that the visualization for heat has been included despite it not being a statistically significant factor for either runner as shown by the p values in the regression.

Multivariate Regression Collie
term estimate std.error statistic p.value
(Intercept) 1.381 0.112 12.353 0.000
Distance 0.007 0.001 4.545 0.000
Slope -0.006 0.000 -15.947 0.000
Heat -0.001 0.004 -0.340 0.734
Multivariate Regression Hound
term estimate std.error statistic p.value
(Intercept) 1.185 0.020 58.044 0.000
Distance 0.000 0.001 -0.238 0.812
Slope -0.013 0.002 -8.481 0.000
Heat 0.000 0.001 -0.671 0.503

In conclusion, after carefully analyzing multiple aspects of fitness, including pace, training intensity, running efficiency, and how both athletes perform under different conditions, it is clear that Collie is the superior runner. Collie consistently exhibits faster pace, greater aerobic fitness, and more reliable improvements in cardiovascular efficiency over time due to a more rigorous training regimen. Additionally, while both runners are impacted by challenging conditions such as slope, Collie is more resilient, showing significantly better performance on steep slopes. These combined factors strongly suggest that Collie is better suited for the high-intensity and varied conditions of the Masters Cross-Country World Championships and thus he will be the selected athlete.