Recently, Nike launched a new line of marathon racing shoes, which are commonly referred to as Vaporflys. Several studies have reported dramatic speed improvements for runners wearing these shoes. I would like to use the Bayesian approach to estimate the magnitude of the improvement this shoes make and test whether the advantage on the performance different between gender, runner and course.
The data are downloaded from the recent paper https://www.researchers.one/article/2020-02-14. There are two .csv files, separated by gender and each data set has the marathon times by runner, course and whether they were wearing the Vaporflys shoes.
Total of 1690 data points are recorded. After removing the data point that has missing value, 840 records are from 308 men and 778 records are from 270 female. Noted that there are 21 different courses for male and 21 different courses for female. However, they are not the same.
name_age | match_name | full_name | marathon | year | date | time_minutes | vaporfly | |
---|---|---|---|---|---|---|---|---|
1 | Aaron Braun (M30) | AARON BRAUN | AARON BRAUN | Chicago Marathon | 2017 | 2017/10/8 | 133.6833 | FALSE |
2 | Braun Aaron (M) | AARON BRAUN | AARON BRAUN | Chicago Marathon | 2018 | 2018/10/7 | 133.2667 | FALSE |
3 | Aaron Braun (M27) | AARON BRAUN | AARON BRAUN | Houston Marathon | 2015 | 2015/1/18 | 132.9000 | FALSE |
4 | Abayneh Ayele (M28) | ABAYNEH AYELE | ABAYNEH AYELE | Chicago Marathon | 2016 | 2016/10/9 | 133.8667 | FALSE |
5 | Abayneh Ayele Woldegiorgis (M29) | ABAYNEH AYELE | ABAYNEH AYELE WOLDEGIORGIS | Houston Marathon | 2017 | 2017/1/15 | 132.7333 | FALSE |
7 | Abdi Abdirahman (M40) | ABDI ABDIRAHMAN | ABDI ABDIRAHMAN | Boston Marathon | 2017 | 2017/4/17 | 132.7500 | TRUE |
Courses included in the male dataset:
## [1] "Chicago Marathon" "Houston Marathon"
## [3] "Boston Marathon" "New York City Marathon"
## [5] "Ottawa Marathon" "Toronto Waterfront Marathon"
## [7] "Grandma's Marathon" "Twin Cities Marathon"
## [9] "Philadelphia Marathon" "Cal Intl Marathon"
## [11] "Men's Olympic Trials Marathon" "LA Marathon"
## [13] "Indianapolis Mon Marathon" "Vermont City Marathon"
## [15] "Eugene Marathon" "Columbus Marathon"
## [17] "Richmond Marathon" "Vancouver Intl Marathon"
## [19] "Phoenix Marathon" "Marine Corps Marathon"
## [21] "Wineglass Marathon"
name_age | match_name | full_name | marathon | year | date | time_minutes | vaporfly |
---|---|---|---|---|---|---|---|
Caroline Rotich (F30) | CAROLINE ROTICH | CAROLINE ROTICH | Boston Marathon | 2015 | 2015/4/20 | 144.9167 | FALSE |
Mare Dibaba (F25) | MARE DIBABA | MARE DIBABA | Boston Marathon | 2015 | 2015/4/20 | 144.9833 | FALSE |
Buzunesh Deba (F27) | BUZUNESH DEBA | BUZUNESH DEBA | Boston Marathon | 2015 | 2015/4/20 | 145.1500 | FALSE |
Desiree Linden (F31) | DESIREE LINDEN | DESIREE LINDEN | Boston Marathon | 2015 | 2015/4/20 | 145.6500 | FALSE |
Sharon Cherop (F31) | SHARON CHEROP | SHARON CHEROP | Boston Marathon | 2015 | 2015/4/20 | 146.0833 | FALSE |
Caroline Kilel (F34) | CAROLINE KILEL | CAROLINE KILEL | Boston Marathon | 2015 | 2015/4/20 | 146.6667 | FALSE |
Courses included in female dataset:
## [1] "Boston Marathon" "Cal Intl Marathon"
## [3] "Chicago Marathon" "Columbus Marathon"
## [5] "Eugene Marathon" "Grandma's Marathon"
## [7] "Houston Marathon" "Indianapolis Mon Marathon"
## [9] "LA Marathon" "Lakefront Marathon"
## [11] "New York City Marathon" "Ottawa Marathon"
## [13] "Philadelphia Marathon" "Phoenix Marathon"
## [15] "Richmond Marathon" "Toronto Waterfront Marathon"
## [17] "Twin Cities Marathon" "Vancouver Intl Marathon"
## [19] "Vermont City Marathon" "Wineglass Marathon"
## [21] "Women's Olympic Trials Marathon"
Gender | date | marathon | Without_Vaporflys | With_Vaporflys | Time_different |
---|---|---|---|---|---|
Female | 2018/10/7 | Chicago Marathon | 161.2444 | 155.3833 | -5.8611111 |
Female | 2018/10/7 | Twin Cities Marathon | 160.1667 | 153.1667 | -7.0000000 |
Female | 2018/11/3 | Indianapolis Mon Marathon | 167.1833 | 164.7667 | -2.4166666 |
Female | 2018/11/4 | New York City Marathon | 154.4083 | 149.3611 | -5.0472222 |
Female | 2018/12/2 | Cal Intl Marathon | 157.6167 | 157.9900 | 0.3733333 |
Female | 2018/4/16 | Boston Marathon | 174.5042 | 168.9333 | -5.5708333 |
Female | 2018/6/16 | Grandma’s Marathon | 160.1125 | 160.4167 | 0.3041666 |
Female | 2019/1/20 | Houston Marathon | 143.4667 | 159.8250 | 16.3583333 |
Female | 2019/10/13 | Chicago Marathon | 154.3667 | 160.2611 | 5.8944444 |
Female | 2019/10/20 | Toronto Waterfront Marathon | 172.9833 | 155.5667 | -17.4166666 |
Female | 2019/10/6 | Twin Cities Marathon | 158.7667 | 154.1167 | -4.6500000 |
Female | 2019/11/3 | New York City Marathon | 147.4833 | 157.6333 | 10.1499999 |
Female | 2019/4/15 | Boston Marathon | 163.8870 | 154.0722 | -9.8148148 |
Female | 2019/6/22 | Grandma’s Marathon | 156.9167 | 166.9583 | 10.0416666 |
Male | 2018/1/14 | Houston Marathon | 133.0917 | 128.8083 | -4.2833333 |
Male | 2018/10/7 | Twin Cities Marathon | 137.4222 | 141.6833 | 4.2611111 |
Male | 2018/11/18 | Philadelphia Marathon | 139.4417 | 137.1500 | -2.2916666 |
Male | 2018/11/4 | New York City Marathon | 149.0000 | 140.1583 | -8.8416667 |
Male | 2018/12/2 | Cal Intl Marathon | 139.6222 | 142.2694 | 2.6472222 |
Male | 2018/4/16 | Boston Marathon | 138.7778 | 146.4889 | 7.7111111 |
Male | 2018/5/27 | Ottawa Marathon | 142.5056 | 138.5667 | -3.9388888 |
Male | 2019/1/20 | Houston Marathon | 150.8222 | 134.1167 | -16.7055556 |
Male | 2019/10/13 | Chicago Marathon | 131.2333 | 135.8444 | 4.6111111 |
Male | 2019/11/24 | Philadelphia Marathon | 146.5667 | 136.5167 | -10.0500000 |
Male | 2019/11/3 | New York City Marathon | 133.4667 | 137.3633 | 3.8966667 |
Male | 2019/11/9 | Indianapolis Mon Marathon | 144.2500 | 138.7667 | -5.4833334 |
Male | 2019/12/8 | Cal Intl Marathon | 144.8500 | 138.5208 | -6.3291667 |
Male | 2019/3/24 | LA Marathon | 131.7833 | 132.0167 | 0.2333334 |
Male | 2019/4/15 | Boston Marathon | 138.4929 | 138.4750 | -0.0178571 |
Male | 2019/6/22 | Grandma’s Marathon | 142.5583 | 136.7333 | -5.8250000 |
In the data summary, we can see that Vaporflys become more popular after 2018. Also, we can discover that sometimes they help the runner but sometimes they don’t. Therefore, a more complete research should be done in order to make a conclusion.
Since the data is separated by gender and the courses are different between male and female. I decided to build up two different models for each gender. The response variable is going to be finishing time (\(Y_i\)) \(i=1,2,...,840\) for male and \(i=1,2,...,778\) for female. And the most important variable in this research is the Vaporflys (\(X_i\)). Therefore, We set up the Vaporflys as the fix effect to see if it helps runner to finish marathon faster. Also, because of the course (\(M_k\)) and the runner (\(R_j\)) are different, I set them as the random effect, therefore, my final model is going to be \[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k\]
\(j=1,2,...,308\) for male and \(j=1,2,...,270\) for female
\(k=1,2,...,21\) for both male and female
In each model, I’ll try 5 different prior
(1) Set fix effect has Normal prior and the random effect has double exponential as prior
(2) Both Fix and Random effect has normal as prior
(3) Let the fix effect using double exponential and random effect using normal as prior
(4) Add an intercept term in the model, the intercept has prior using normal distribution centered at the mean of Y, set fix effect as double exponential and random effect as normal
(5) Similar to model 4 but change fix effect into normal prior.
In this part I will use R jags
to run the MCMC for each model. 2 chain and 25000 iteration will be run. The first 5000 iteration were discarded as a burn-in and the rest of 20,000 are sampled. For the model checking, I’ll show you the trace plot to see if the parameter convergence during the MCMC. Also, Effective sample size , Geweke Diagnostic and Gelman-Rubin statistic will be provided to help us check the convergence of the models. Ideally, the effective sample size should be larger than 1000. The absolute Geweke Diagnostic value should beless than 2. And, Gelman-Rubin statistic should be approximately 1 not over 1.1
\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(0, 1/1000)\] Random effect: \[\beta^R_j \sim dexp(0, \epsilon_i*\epsilon_R)\] \[\beta^M_k \sim dexp(0, \epsilon_i*\epsilon_M)\] \[R_j \sim dexp(0, \epsilon_i*\epsilon_r)\] \[M_k \sim dexp(0, \epsilon_i*\epsilon_m)\]
Effective sample size:
beta
49114.52
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -0.7439
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -1.149
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(\mu_\beta, 1/1000)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu_\beta \sim N(0,1/100)\]
Effective sample size:
beta
7692.271
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -0.3377
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -1.073
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1.01
\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\]
Effective sample size:
beta
7115.319
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -1.183
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -1.524
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=\alpha +(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(135, 1/100)\]
Effective sample size:
beta
7082.916
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta
-0.04491
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -3.517
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=\alpha+(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim N(0, \epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(135, 1/100)\]
Effective sample size:
beta
7761.908
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 1.304
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 0.8697
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(0, 1/1000)\] Random effect: \[\beta^R_j \sim dexp(0, \epsilon_R*\epsilon_i)\] \[\beta^M_k \sim dexp(0, \epsilon_M*\epsilon_i)\] \[R_j \sim dexp(0, \epsilon_r*\epsilon_i)\] \[M_k \sim dexp(0, \epsilon_m*\epsilon_i)\]
Effective sample size:
beta 50000
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -2.52
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -0.9429
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(\mu, \epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(0, 1/100)\]
Effective sample size:
beta
5543.696
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -0.8624
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -1.173
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1.01
\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\]
Effective sample size:
beta
9958.377
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 0.04558
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta -1.014
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=\alpha +(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(140, 1/100)\]
Effective sample size:
beta 10678.8
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 0.4643
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 1.046
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
\[Y_i=\alpha +(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim N(0, \epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(140, 1/100)\]
Effective sample size:
beta
8911.889
Geweke Diagnostic:
[[1]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 0.1462
[[2]]
Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5
beta 2.016
Gelman-Rubin statistic:
Potential scale reduction factors:
Point est. Upper C.I.
beta 1 1
After running all the model, we can see that \(\beta\) can converge on every prior we choose no matter which gender. Only model 3 fail on the Geweke Diagnostic for both male and female. Next, let’s use DIC and WAIC to choose the best model between them.
DIC comparision for male:
$Model1 Mean deviance: 4982 penalty 204.6 Penalized deviance: 5186
$Model2 Mean deviance: 4975 penalty 203 Penalized deviance: 5178
$Model3 Mean deviance: 4974 penalty 202.6 Penalized deviance: 5177
$Model4 Mean deviance: 4969 penalty 200.8 Penalized deviance: 5169
$Model5 Mean deviance: 4970 penalty 200.1 Penalized deviance: 5170
DIC comparision for female:
$Model1 Mean deviance: 5186 penalty 143.3 Penalized deviance: 5329
$Model2 Mean deviance: 5184 penalty 143.4 Penalized deviance: 5327
$Model3 Mean deviance: 5184 penalty 142.7 Penalized deviance: 5326
$Model4 Mean deviance: 5182 penalty 143.5 Penalized deviance: 5326
$Model5 Mean deviance: 5183 penalty 144 Penalized deviance: 5327
Male | Female | |
---|---|---|
model1 | 5202.250 | 5323.835 |
model2 | 5170.537 | 5312.418 |
model3 | 5169.926 | 5312.400 |
model4 | 5162.603 | 5312.108 |
model5 | 5162.739 | 5312.158 |
After seeing the DIC and WAIC, I believe model 4 is the best model for both gender.
Fixed Vaporflys effect
##
## Iterations = 9001:39000
## Thinning interval = 1
## Number of chains = 2
## Sample size per chain = 30000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## -2.025343 0.679056 0.002772 0.007385
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## -3.3260 -2.4738 -2.0402 -1.5961 -0.6313
Fixed Vaporflys effect
##
## Iterations = 6001:36000
## Thinning interval = 1
## Number of chains = 2
## Sample size per chain = 30000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## -2.316291 1.099596 0.004489 0.009905
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## -4.4295 -3.0656 -2.3465 -1.5804 -0.1108
To conclude, The Vaporflys seem to make a really great improvement for runner no matter for male or female. The \(\beta\) has the mean -2 for male and -2.3 for female. That is a incredible improvement for a marathon runner. On the other hand, I will say that the improvement doesn’t not have a significantly different between runner and courses. Since all the random effect \(\beta_r\) and \(\beta_m\) both including 0 in their 95% quantiles.Therefore, this is telling us that the Vaporfly is actually a fix effect.
To clarify the improvement between Male and Female, I use the sampled beta to do the t test. And the result show below says that the improvement for male and female are not the same. That is to say, Vaporflys really help a lot for marathon runner especially for women.
##
## Welch Two Sample t-test
##
## data: MaleBeta and FemaleBeta
## t = 37.054, df = 48837, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.2646301 0.2941892
## sample estimates:
## mean of x mean of y
## -2.035049 -2.314459