1984 SUS (System Usability Scale)
It is the 25th anniversary of the creation of the most widely used questionnaire for measuring perceptions of usability. The System Usability Scale (SUS) was released into this world by John Brooke in 1986. It was originally created as a "quick and dirty" scale to administer after usability tests on systems like VT100 Terminal ("Green-Screen") applications. SUS is technology independent and has since been tested on hardware, consumer software, websites, cell phones, IVRs and even the Yellow Pages. It has become an industry standard, with references in over 600 publications.
Scoring SUS
For odd-numbered items: subtract one from the user response. For even-numbered items: subtract the user response from 5. This scales all values from 0 to 4 (with four being the most positive response). Add up the converted responses for each user and multiply that total by 2.5. This converts the range of possible values from 0 to 40 into 0 to 100.
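The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring tool: the function name `sus_score` is hypothetical, and it assumes each user supplies ten responses on a 1 to 5 scale, listed in item order (which is what the 0 to 40 range implies).

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one user's responses.

    `responses` is assumed to hold ten integers from 1 to 5,
    in item order (item 1 first).
    """
    if len(responses) != 10:
        raise ValueError("expected exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered item: subtract one from the response.
            total += response - 1
        else:
            # Even-numbered item: subtract the response from 5.
            total += 5 - response

    # The converted total ranges from 0 to 40; rescale to 0-100.
    return total * 2.5
```

A user who answers 3 to every item gets a converted total of 20, so `sus_score([3] * 10)` returns 50.0. To get a study-level score, average the per-user scores.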
Raw SUS scores can be converted into percentile ranks and letter grades (from A+ to F) for eight different application types. The graph below shows how the percentile ranks associate with SUS scores and letter grades.
This process is similar to "grading on a curve" based on the distribution of all scores. For example, a raw SUS score of 74 converts to a percentile rank of 70%. That is, a SUS score of 74 has higher perceived usability than 70% of all products tested. It can be interpreted as a grade of B-. You'd need to score above 80.3 to get an A (the top 10% of scores). This is also the point where users are more likely to recommend the product to a friend. Scoring at the mean of 68 gets you a C, and anything below 51 is an F (putting you in the bottom 15%).
SUS is Reliable
Reliability refers to how consistently users respond to the items (the repeatability of the responses). SUS has been shown to be more reliable and detect differences at smaller sample sizes than home-grown questionnaires and other commercially available ones. Sample size and reliability are unrelated, so SUS can be used on very small sample sizes (as few as two users) and still generate reliable results. However, small sample sizes generate imprecise estimates of the unknown user-population SUS score. You should compute a confidence interval around your sample SUS score to understand the variability in your estimate.
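As a rough illustration of that last step, the sketch below computes a 95% t-based confidence interval around the mean of a set of per-user SUS scores. The function name `sus_confidence_interval` is hypothetical, the choice of a t-interval is an assumption (the text does not name a method), and the result is clipped to the 0 to 100 range that SUS scores can take.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1..30 degrees of freedom.
T_CRIT_95 = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
]


def sus_confidence_interval(scores):
    """Return (mean, low, high): a 95% t-interval for the mean SUS score."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate variability")

    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    df = n - 1

    # Beyond 30 degrees of freedom, fall back to the normal value 1.96,
    # which makes the interval slightly too narrow for moderate samples.
    t_crit = T_CRIT_95[df - 1] if df <= len(T_CRIT_95) else 1.96

    margin = t_crit * sd / math.sqrt(n)
    # SUS scores cannot fall outside 0-100, so clip the interval (assumption).
    return mean, max(0.0, mean - margin), min(100.0, mean + margin)
```

With only two users scoring 70 and 80, the mean is 75 but the interval runs from about 11.5 up to the 100 ceiling, which shows how imprecise a very small sample is even when the questionnaire itself is reliable.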
SUS is Valid
Validity refers to how well something measures what it is intended to measure. In this case, that's perceived usability. SUS has been shown to distinguish between unusable and usable systems as well as or better than proprietary questionnaires. SUS also correlates highly with other questionnaire-based measurements of usability (called concurrent validity).
and time), which means that only around 6% of the SUS scores are explained by what happens in the usability test. This is the same level of correlation found with other post-test questionnaires.