Introduction

Recently, Nike launched a new line of marathon racing shoes, which are commonly referred to as Vaporflys. Several studies have reported dramatic speed improvements for runners wearing these shoes. I would like to use the Bayesian approach to estimate the magnitude of the improvement this shoes make and test whether the advantage on the performance different between gender, runner and course.

Data

The data are downloaded from the recent paper https://www.researchers.one/article/2020-02-14. There are two .csv files, separated by gender and each data set has the marathon times by runner, course and whether they were wearing the Vaporflys shoes.
Total of 1690 data points are recorded. After removing the data point that has missing value, 840 records are from 308 men and 778 records are from 270 female. Noted that there are 21 different courses for male and 21 different courses for female. However, they are not the same.

Data for Male:

name_age match_name full_name marathon year date time_minutes vaporfly
1 Aaron Braun (M30) AARON BRAUN AARON BRAUN Chicago Marathon 2017 2017/10/8 133.6833 FALSE
2 Braun Aaron (M) AARON BRAUN AARON BRAUN Chicago Marathon 2018 2018/10/7 133.2667 FALSE
3 Aaron Braun (M27) AARON BRAUN AARON BRAUN Houston Marathon 2015 2015/1/18 132.9000 FALSE
4 Abayneh Ayele (M28) ABAYNEH AYELE ABAYNEH AYELE Chicago Marathon 2016 2016/10/9 133.8667 FALSE
5 Abayneh Ayele Woldegiorgis (M29) ABAYNEH AYELE ABAYNEH AYELE WOLDEGIORGIS Houston Marathon 2017 2017/1/15 132.7333 FALSE
7 Abdi Abdirahman (M40) ABDI ABDIRAHMAN ABDI ABDIRAHMAN Boston Marathon 2017 2017/4/17 132.7500 TRUE

Courses included in the male dataset:

##  [1] "Chicago Marathon"              "Houston Marathon"             
##  [3] "Boston Marathon"               "New York City Marathon"       
##  [5] "Ottawa Marathon"               "Toronto Waterfront Marathon"  
##  [7] "Grandma's Marathon"            "Twin Cities Marathon"         
##  [9] "Philadelphia Marathon"         "Cal Intl Marathon"            
## [11] "Men's Olympic Trials Marathon" "LA Marathon"                  
## [13] "Indianapolis Mon Marathon"     "Vermont City Marathon"        
## [15] "Eugene Marathon"               "Columbus Marathon"            
## [17] "Richmond Marathon"             "Vancouver Intl Marathon"      
## [19] "Phoenix Marathon"              "Marine Corps Marathon"        
## [21] "Wineglass Marathon"

Data for Female:

name_age match_name full_name marathon year date time_minutes vaporfly
Caroline Rotich (F30) CAROLINE ROTICH CAROLINE ROTICH Boston Marathon 2015 2015/4/20 144.9167 FALSE
Mare Dibaba (F25) MARE DIBABA MARE DIBABA Boston Marathon 2015 2015/4/20 144.9833 FALSE
Buzunesh Deba (F27) BUZUNESH DEBA BUZUNESH DEBA Boston Marathon 2015 2015/4/20 145.1500 FALSE
Desiree Linden (F31) DESIREE LINDEN DESIREE LINDEN Boston Marathon 2015 2015/4/20 145.6500 FALSE
Sharon Cherop (F31) SHARON CHEROP SHARON CHEROP Boston Marathon 2015 2015/4/20 146.0833 FALSE
Caroline Kilel (F34) CAROLINE KILEL CAROLINE KILEL Boston Marathon 2015 2015/4/20 146.6667 FALSE

Courses included in female dataset:

##  [1] "Boston Marathon"                 "Cal Intl Marathon"              
##  [3] "Chicago Marathon"                "Columbus Marathon"              
##  [5] "Eugene Marathon"                 "Grandma's Marathon"             
##  [7] "Houston Marathon"                "Indianapolis Mon Marathon"      
##  [9] "LA Marathon"                     "Lakefront Marathon"             
## [11] "New York City Marathon"          "Ottawa Marathon"                
## [13] "Philadelphia Marathon"           "Phoenix Marathon"               
## [15] "Richmond Marathon"               "Toronto Waterfront Marathon"    
## [17] "Twin Cities Marathon"            "Vancouver Intl Marathon"        
## [19] "Vermont City Marathon"           "Wineglass Marathon"             
## [21] "Women's Olympic Trials Marathon"

Data summary

Time Reduce Between with and without Vaporflys
Gender date marathon Without_Vaporflys With_Vaporflys Time_different
Female 2018/10/7 Chicago Marathon 161.2444 155.3833 -5.8611111
Female 2018/10/7 Twin Cities Marathon 160.1667 153.1667 -7.0000000
Female 2018/11/3 Indianapolis Mon Marathon 167.1833 164.7667 -2.4166666
Female 2018/11/4 New York City Marathon 154.4083 149.3611 -5.0472222
Female 2018/12/2 Cal Intl Marathon 157.6167 157.9900 0.3733333
Female 2018/4/16 Boston Marathon 174.5042 168.9333 -5.5708333
Female 2018/6/16 Grandma’s Marathon 160.1125 160.4167 0.3041666
Female 2019/1/20 Houston Marathon 143.4667 159.8250 16.3583333
Female 2019/10/13 Chicago Marathon 154.3667 160.2611 5.8944444
Female 2019/10/20 Toronto Waterfront Marathon 172.9833 155.5667 -17.4166666
Female 2019/10/6 Twin Cities Marathon 158.7667 154.1167 -4.6500000
Female 2019/11/3 New York City Marathon 147.4833 157.6333 10.1499999
Female 2019/4/15 Boston Marathon 163.8870 154.0722 -9.8148148
Female 2019/6/22 Grandma’s Marathon 156.9167 166.9583 10.0416666
Male 2018/1/14 Houston Marathon 133.0917 128.8083 -4.2833333
Male 2018/10/7 Twin Cities Marathon 137.4222 141.6833 4.2611111
Male 2018/11/18 Philadelphia Marathon 139.4417 137.1500 -2.2916666
Male 2018/11/4 New York City Marathon 149.0000 140.1583 -8.8416667
Male 2018/12/2 Cal Intl Marathon 139.6222 142.2694 2.6472222
Male 2018/4/16 Boston Marathon 138.7778 146.4889 7.7111111
Male 2018/5/27 Ottawa Marathon 142.5056 138.5667 -3.9388888
Male 2019/1/20 Houston Marathon 150.8222 134.1167 -16.7055556
Male 2019/10/13 Chicago Marathon 131.2333 135.8444 4.6111111
Male 2019/11/24 Philadelphia Marathon 146.5667 136.5167 -10.0500000
Male 2019/11/3 New York City Marathon 133.4667 137.3633 3.8966667
Male 2019/11/9 Indianapolis Mon Marathon 144.2500 138.7667 -5.4833334
Male 2019/12/8 Cal Intl Marathon 144.8500 138.5208 -6.3291667
Male 2019/3/24 LA Marathon 131.7833 132.0167 0.2333334
Male 2019/4/15 Boston Marathon 138.4929 138.4750 -0.0178571
Male 2019/6/22 Grandma’s Marathon 142.5583 136.7333 -5.8250000


In the data summary, we can see that Vaporflys become more popular after 2018. Also, we can discover that sometimes they help the runner but sometimes they don’t. Therefore, a more complete research should be done in order to make a conclusion.

Methods

Since the data is separated by gender and the courses are different between male and female. I decided to build up two different models for each gender. The response variable is going to be finishing time (\(Y_i\)) \(i=1,2,...,840\) for male and \(i=1,2,...,778\) for female. And the most important variable in this research is the Vaporflys (\(X_i\)). Therefore, We set up the Vaporflys as the fix effect to see if it helps runner to finish marathon faster. Also, because of the course (\(M_k\)) and the runner (\(R_j\)) are different, I set them as the random effect, therefore, my final model is going to be \[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k\]

\(j=1,2,...,308\) for male and \(j=1,2,...,270\) for female

\(k=1,2,...,21\) for both male and female

In each model, I’ll try 5 different prior
(1) Set fix effect has Normal prior and the random effect has double exponential as prior
(2) Both Fix and Random effect has normal as prior
(3) Let the fix effect using double exponential and random effect using normal as prior
(4) Add an intercept term in the model, the intercept has prior using normal distribution centered at the mean of Y, set fix effect as double exponential and random effect as normal
(5) Similar to model 4 but change fix effect into normal prior.

Models

In this part I will use R jags to run the MCMC for each model. 2 chain and 25000 iteration will be run. The first 5000 iteration were discarded as a burn-in and the rest of 20,000 are sampled. For the model checking, I’ll show you the trace plot to see if the parameter convergence during the MCMC. Also, Effective sample size , Geweke Diagnostic and Gelman-Rubin statistic will be provided to help us check the convergence of the models. Ideally, the effective sample size should be larger than 1000. The absolute Geweke Diagnostic value should beless than 2. And, Gelman-Rubin statistic should be approximately 1 not over 1.1

Model for Male

Model 1

\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(0, 1/1000)\] Random effect: \[\beta^R_j \sim dexp(0, \epsilon_i*\epsilon_R)\] \[\beta^M_k \sim dexp(0, \epsilon_i*\epsilon_M)\] \[R_j \sim dexp(0, \epsilon_i*\epsilon_r)\] \[M_k \sim dexp(0, \epsilon_i*\epsilon_m)\]

Effective sample size:

beta 

49114.52

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -0.7439

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -1.149

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model 2

\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(\mu_\beta, 1/1000)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu_\beta \sim N(0,1/100)\]

Effective sample size:

beta 

7692.271

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -0.3377

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -1.073

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1.01

Model 3

\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\]

Effective sample size:

beta 

7115.319

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -1.183

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -1.524

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model 4

\[Y_i=\alpha +(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(135, 1/100)\]

Effective sample size:

beta 

7082.916

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 

-0.04491

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -3.517

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model 5

\[Y_i=\alpha+(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim N(0, \epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(135, 1/100)\]

Effective sample size:

beta 

7761.908

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 1.304

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 0.8697

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model for Female

Model 1

\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(0, 1/1000)\] Random effect: \[\beta^R_j \sim dexp(0, \epsilon_R*\epsilon_i)\] \[\beta^M_k \sim dexp(0, \epsilon_M*\epsilon_i)\] \[R_j \sim dexp(0, \epsilon_r*\epsilon_i)\] \[M_k \sim dexp(0, \epsilon_m*\epsilon_i)\]

Effective sample size:

beta 50000

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -2.52

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -0.9429

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model 2

\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim N(\mu, \epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(0, 1/100)\]

Effective sample size:

beta 

5543.696

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -0.8624

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -1.173

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1.01

Model 3

\[Y_i=(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\]

Effective sample size:

beta 

9958.377

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 0.04558

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta -1.014

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model 4

\[Y_i=\alpha +(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim dexp(0, \epsilon_i*\epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(140, 1/100)\]

Effective sample size:

beta 10678.8

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 0.4643

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 1.046

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model 5

\[Y_i=\alpha +(\beta+\beta^R_j+\beta^M_k)*X_i+R_j+M_k+ \epsilon_i\] With Fix effect: \[\alpha \sim N(\mu, 1/1000)\] \[\beta \sim N(0, \epsilon)\] Random effect: \[\beta^R_j \sim N(0, \epsilon_R)\] \[\beta^M_k \sim N(0, \epsilon_M)\] \[R_j \sim N(0, \epsilon_r)\] \[M_k \sim N(0, \epsilon_m)\] \[\mu \sim N(140, 1/100)\]

Effective sample size:

beta 

8911.889

Geweke Diagnostic:

[[1]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 0.1462

[[2]]

Fraction in 1st window = 0.1 Fraction in 2nd window = 0.5

beta 2.016

Gelman-Rubin statistic:

Potential scale reduction factors:

 Point est. Upper C.I.

beta 1 1

Model comparision

After running all the model, we can see that \(\beta\) can converge on every prior we choose no matter which gender. Only model 3 fail on the Geweke Diagnostic for both male and female. Next, let’s use DIC and WAIC to choose the best model between them.

DIC

DIC comparision for male:

$Model1 Mean deviance: 4982 penalty 204.6 Penalized deviance: 5186

$Model2 Mean deviance: 4975 penalty 203 Penalized deviance: 5178

$Model3 Mean deviance: 4974 penalty 202.6 Penalized deviance: 5177

$Model4 Mean deviance: 4969 penalty 200.8 Penalized deviance: 5169

$Model5 Mean deviance: 4970 penalty 200.1 Penalized deviance: 5170

DIC comparision for female:

$Model1 Mean deviance: 5186 penalty 143.3 Penalized deviance: 5329

$Model2 Mean deviance: 5184 penalty 143.4 Penalized deviance: 5327

$Model3 Mean deviance: 5184 penalty 142.7 Penalized deviance: 5326

$Model4 Mean deviance: 5182 penalty 143.5 Penalized deviance: 5326

$Model5 Mean deviance: 5183 penalty 144 Penalized deviance: 5327

WAIC

WAIC comparision
Male Female
model1 5202.250 5323.835
model2 5170.537 5312.418
model3 5169.926 5312.400
model4 5162.603 5312.108
model5 5162.739 5312.158

After seeing the DIC and WAIC, I believe model 4 is the best model for both gender.

Result

Male

Fixed Vaporflys effect

## 
## Iterations = 9001:39000
## Thinning interval = 1 
## Number of chains = 2 
## Sample size per chain = 30000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##      -2.025343       0.679056       0.002772       0.007385 
## 
## 2. Quantiles for each variable:
## 
##    2.5%     25%     50%     75%   97.5% 
## -3.3260 -2.4738 -2.0402 -1.5961 -0.6313

Female

Fixed Vaporflys effect

## 
## Iterations = 6001:36000
## Thinning interval = 1 
## Number of chains = 2 
## Sample size per chain = 30000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##      -2.316291       1.099596       0.004489       0.009905 
## 
## 2. Quantiles for each variable:
## 
##    2.5%     25%     50%     75%   97.5% 
## -4.4295 -3.0656 -2.3465 -1.5804 -0.1108

To conclude, The Vaporflys seem to make a really great improvement for runner no matter for male or female. The \(\beta\) has the mean -2 for male and -2.3 for female. That is a incredible improvement for a marathon runner. On the other hand, I will say that the improvement doesn’t not have a significantly different between runner and courses. Since all the random effect \(\beta_r\) and \(\beta_m\) both including 0 in their 95% quantiles.Therefore, this is telling us that the Vaporfly is actually a fix effect.
To clarify the improvement between Male and Female, I use the sampled beta to do the t test. And the result show below says that the improvement for male and female are not the same. That is to say, Vaporflys really help a lot for marathon runner especially for women.

## 
##  Welch Two Sample t-test
## 
## data:  MaleBeta and FemaleBeta
## t = 37.054, df = 48837, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.2646301 0.2941892
## sample estimates:
## mean of x mean of y 
## -2.035049 -2.314459