by Steve Jong
Reprinted from Usability Interface, Vol 7, No. 2, October 2000
ISO defines usability as “a measure of the effectiveness, efficiency, and satisfaction with which specified users can achieve goals in a particular environment.” To those of us who are interested in documentation quality metrics, this definition is marvelous. We quibble over which typeface maximizes readability, and over how much readability affects quality, but there’s no arguing with the value of reducing time to task completion. Usability testing closes the loop in the product development cycle; without it, development can spiral off in the wrong direction. Upcoming revisions to the ISO quality standards not only incorporate usability as a quality requirement but also mandate its measurement.

To me, usability is an excellent working definition of product quality, and the field of usability is rich in metrics. The briefest web search yields dozens of them: time to complete a task, the percentage of tasks completed without error, the number of commands, keystrokes, or mouse clicks used, the number and type of errors per task, and the percentage of satisfied users and their degree of satisfaction. Many of these are critical success factors. In my view, usability metrics are concrete, meaningful, objective, causative, and clearly useful.
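To make a few of these concrete, here is a minimal sketch in Python (the session records and field names are invented for illustration) of how such metrics might be computed from task-session logs:

```python
from statistics import mean

# Hypothetical task-session records: one per user per task.
sessions = [
    {"completed": True,  "errors": 0, "seconds": 48,  "satisfied": True},
    {"completed": True,  "errors": 2, "seconds": 95,  "satisfied": False},
    {"completed": False, "errors": 5, "seconds": 120, "satisfied": False},
    {"completed": True,  "errors": 0, "seconds": 52,  "satisfied": True},
]

n = len(sessions)
completion_rate = sum(s["completed"] for s in sessions) / n
error_free_rate = sum(s["completed"] and s["errors"] == 0 for s in sessions) / n
mean_time = mean(s["seconds"] for s in sessions)
satisfaction = sum(s["satisfied"] for s in sessions) / n

print(f"Tasks completed:         {completion_rate:.0%}")
print(f"Completed without error: {error_free_rate:.0%}")
print(f"Mean time on task:       {mean_time:.0f} s")
print(f"Satisfied users:         {satisfaction:.0%}")
```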
Usability metrics are evaluative: they describe the results of a development effort. But gathering usability data can be expensive, and gathering it for all of a company’s products (or documentation) simply isn’t practical. The question “Is this product usable?” leads to the general quality question: “What makes your product usable, and how can I make mine usable too?” Usability professionals offer heuristic rules: suggestions that, when followed, lead to more usable products. We have heuristics of our own. I believe measuring adherence to these rules is a cost-effective way to predict the usability of products or their documentation.
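To illustrate the idea only (the rules below and their pass/fail marks are invented, not a published checklist), adherence to a set of heuristics can be scored mechanically during a review:

```python
# Hypothetical heuristic checklist for a document; each rule is
# marked pass (True) or fail (False) during an editing review.
checklist = {
    "procedures have ten steps or fewer": True,
    "every step is at the same level of detail": False,
    "UI names verified against the product": True,
    "each procedure verified by QA": True,
}

adherence = sum(checklist.values()) / len(checklist)
print(f"Heuristic adherence: {adherence:.0%}")  # 75% for this example
```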
For example, how many steps are there per procedure in your document, and are they all at about the same level of detail? A procedure with 70 steps (yes, I’ve seen one!) takes longer to complete than one with seven. Obviously, the usability of your document is intertwined with the usability of the product it describes; you’re well advised to tell your developers that a 70-step procedure is unusable. Did you adhere to a process of technical review and editing, so that command, argument, window, field, menu, and button names are accurate? Did QA verify your procedures? If so, you can be reasonably confident that your document is not causing user errors.
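As a rough sketch, assuming a plain-text manual in which steps are numbered lines and a blank line ends each procedure (both assumptions about format, not a standard), you could count steps per procedure automatically:

```python
import re

MAX_STEPS = 10  # an arbitrary house limit, not an industry standard

def step_counts(text: str) -> list[int]:
    """Count numbered steps (lines like '1. Click OK') per procedure."""
    counts, current = [], 0
    for line in text.splitlines() + [""]:  # trailing "" flushes the last procedure
        if re.match(r"\s*\d+\.\s", line):
            current += 1
        elif not line.strip() and current:
            counts.append(current)
            current = 0
    return counts

sample = "1. Open the file\n2. Edit it\n3. Save it\n\n1. Reboot\n"
for i, steps in enumerate(step_counts(sample), 1):
    flag = "  <-- candidate for splitting" if steps > MAX_STEPS else ""
    print(f"Procedure {i}: {steps} steps{flag}")
```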
Usability experts dispute the interpretation of their own metrics. They wonder whether reducing task time by 12% means a 12% improvement in productivity or a 12% increase in stress. Which carries more weight: time to task completion or user satisfaction? I recognize in these debates a yearning for a single usability score. But a prior question is whether the distribution of values for a given metric is normal; if it is not, averaging the values, or comparing the averages, has little meaning.
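As a sketch of that check (the time-on-task samples are invented), a normality test such as Shapiro–Wilk can flag a metric whose distribution makes means a poor summary:

```python
from scipy import stats

# Hypothetical time-on-task samples (seconds) for two builds.
build_a = [41, 44, 39, 47, 43, 40, 46, 42, 45, 44]
build_b = [38, 36, 40, 37, 120, 39, 41, 35, 38, 110]  # two users got lost

for name, data in [("A", build_a), ("B", build_b)]:
    statistic, p = stats.shapiro(data)
    verdict = "looks normal" if p > 0.05 else "not normal: compare medians, not means"
    print(f"Build {name}: Shapiro-Wilk p = {p:.3f} -> {verdict}")
```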
The usability literature also speaks of establishing goals, but I would caution against setting them arbitrarily (say, “the user shall complete the task within thirty seconds”). As Dr. Deming said, the data is what it is. A better approach would be to establish the usability of the system by taking measurements, plotting run charts, and determining control limits, then looking for the root causes of variation. Reducing variation, or continual improvement, is the great promise of Deming’s methods.
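Here is a minimal sketch of that approach, with invented weekly measurements, using an individuals (XmR) chart, one common way to set control limits:

```python
from statistics import mean

# Hypothetical weekly mean time-on-task measurements (seconds).
times = [52, 48, 55, 50, 47, 53, 49, 95, 51, 46, 54, 50]

center = mean(times)
# XmR chart: derive limits from the mean moving range between
# consecutive points; 2.66 is the standard XmR constant.
moving_ranges = [abs(a - b) for a, b in zip(times, times[1:])]
mr_bar = mean(moving_ranges)
ucl = center + 2.66 * mr_bar
lcl = center - 2.66 * mr_bar

print(f"Center line {center:.1f} s, control limits [{lcl:.1f}, {ucl:.1f}]")
for week, t in enumerate(times, 1):
    if not lcl <= t <= ucl:
        print(f"Week {week}: {t} s is outside the limits; look for a special cause")
```

A point outside the limits (week 8 here) signals a special cause worth investigating; points inside the limits reflect ordinary variation, and reacting to them one by one is what Deming called tampering, not improvement.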
If you believe that everyone stumbles in the same place or for the same reason, then studying a few users is sufficient to form a reliable opinion of a product’s usability. If you don’t, then I think a usability study, like any other statistical effort, requires a randomly selected sample large enough to yield statistically significant results.
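For a sense of scale, the standard sample-size formula for estimating a proportion, n = z²p(1−p)/e², shows how quickly the required sample grows as you tighten the margin of error:

```python
from math import ceil

def sample_size(z: float = 1.96, p: float = 0.5, e: float = 0.10) -> int:
    """Sample size needed to estimate a proportion.

    z: z-score for the confidence level (1.96 for 95%)
    p: expected proportion (0.5 is the most conservative choice)
    e: acceptable margin of error
    """
    return ceil(z**2 * p * (1 - p) / e**2)

# To estimate a task-completion rate within +/-10 points at 95% confidence:
print(sample_size())         # 97 users
# Within +/-5 points:
print(sample_size(e=0.05))   # 385 users
```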