Date of Completion


Document Type

Honors College Thesis


Rehabilitation and Movement Sciences

Thesis Type

Honors College

First Advisor

Bruce Beynnon

Second Advisor

Ge Wu

Third Advisor

Kelly Tourville


Reliability, Accuracy, FIFA 11+, Outcome Measure


The FIFA 11+ Pre-Participation study at the University of Vermont is investigating the impact of the FIFA 11+ program on injuries in high school athletes. The FIFA 11+ pre-participation program has been shown to reduce the incidence of lower extremity injury in elite soccer athletes; however, it is unclear whether the program has a similar effect on developing high school athletes or whether it can reduce injury in other sports. Because an athlete’s compliance with the FIFA 11+ program may be directly linked to the program’s effectiveness, an outcome measure that documents compliance was developed. The outcome measure is a form on which observers for the FIFA 11+ study record the components of a pre-participation warm-up. The objective of this investigation was to establish the intrarater reliability, interrater reliability, and accuracy of this compliance outcome measure.

A repeated-measures study design was used to determine the reliability and accuracy of the outcome measure. The examiners who collected data for the FIFA 11+ study were asked to volunteer for this investigation, which involved attending two observation sessions held two weeks apart. In each session, the examiners watched the same five videos of pre-participation warm-ups, each about ten minutes long, and used the outcome measure to record what occurred in each warm-up. The outcome measure listed 66 warm-up exercises, or components, that could be recorded. Intraclass Correlation Coefficients (ICCs) were used to determine the intrarater and interrater reliability of each component of the outcome measure. A component with an ICC above 0.60 was considered reliable for this study. Accuracy was determined from the sensitivity and specificity of each component, together with the examiners’ percent agreement with the gold standard examiner for that component. A component was considered accurate if more than 60% of the observations agreed with the gold standard examiner on whether the component was or was not present in the warm-up. Where components were found to be unreliable or inaccurate, the outcome measure was simplified by reducing the number of components. Each new component resulted from combining two unreliable components; its ICCs, sensitivity, and specificity were recalculated as if all observations of either original component counted toward the new component.
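The accuracy criteria described above can be illustrated with a minimal Python sketch. It is not the thesis's actual analysis code: the function names, the 0/1 coding of "component recorded" versus "not recorded", and the sample observations are all hypothetical, and only the 60% agreement threshold and the rule for combining two components come from the text.

```python
# Illustrative sketch only; names and data are hypothetical, not from the thesis.

ACCURACY_THRESHOLD = 0.60  # a component is accurate if agreement exceeds 60%


def accuracy_metrics(examiner, gold):
    """Compare one examiner's 0/1 observations of a component with the
    gold standard examiner's 0/1 observations of the same warm-ups."""
    pairs = list(zip(examiner, gold))
    tp = sum(1 for e, g in pairs if e and g)          # both recorded it
    tn = sum(1 for e, g in pairs if not e and not g)  # both did not record it
    fp = sum(1 for e, g in pairs if e and not g)      # examiner only
    fn = sum(1 for e, g in pairs if not e and g)      # gold standard only
    return {
        # Sensitivity: share of truly present components the examiner recorded.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Specificity: share of truly absent components the examiner left blank.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        # Percent agreement: share of all observations matching the gold standard.
        "agreement": (tp + tn) / len(pairs),
    }


def is_accurate(metrics):
    """Apply the 'more than 60% agreement' rule to one component."""
    return metrics["agreement"] > ACCURACY_THRESHOLD


def combine_components(a, b):
    """Merge two components: an observation of either original component
    counts as an observation of the new, combined component."""
    return [int(bool(x) or bool(y)) for x, y in zip(a, b)]


# Hypothetical observations of one component across ten viewings.
gold = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
examiner = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

metrics = accuracy_metrics(examiner, gold)
print(metrics, is_accurate(metrics))
```

When two components are combined, the same metrics would be recomputed on the merged observations, for example `accuracy_metrics(combine_components(exam_a, exam_b), combine_components(gold_a, gold_b))`, where the four lists are again hypothetical.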

The outcome measure was found to be partially reliable and partially accurate. Of the 34 components observed, five were intrarater unreliable and 18 were interrater unreliable. All of the components that were intrarater unreliable were also interrater unreliable. Only one component was inaccurate, with 58% of the observations of that component in agreement with the gold standard examiner’s observations. Of the 18 unreliable components, seven were combined with another component to simplify the outcome measure.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.