If standards are going to be meaningful, we must have a way of expressing fairly precisely the quality level we expect, and a way of measuring the system's performance against the standards laid down. Informal measurement is better than no measurement, but being able to measure something numerically is a great advantage. Numerical measures of the products and processes of software development are called software metrics. Within the overall software project there are many kinds of metric that may be useful; let us look at some of them.
Once something can be measured numerically, you move away from the world of opinion and intuition - "I like using this interface but I don't know why" - to a precise world of facts and figures - "It took me 5 minutes longer to perform the task using the old system than it does using the new one". However, we should issue a word of warning: figures can be misleading if the wrong measurements are taken, and even the correct measures can mislead if they are interpreted incorrectly.
People are keenly affected by their surroundings, both physically and mentally, so it is important to make sure that measurement takes place in a context that is as close as possible to the target environment.
Various kinds of metrics can be devised to measure the design at different stages of development; these include: analytic, performance, and psychometrics.
Analytic metrics are generally applicable when only a paper-based product specification is available. Performance metrics, by contrast, are taken by observing users actually carrying out tasks with the system. There are many different performance-related measures that can be taken: how long it takes to modify a sentence, how many errors are made in an hour, how many times the mouse button is clicked, how many cups of coffee the user drinks in a day, and so on. These kinds of measures are at the heart of usability engineering, and when systematically recorded in a usability specification they provide designers with information about acceptable levels of usability. There are four main kinds of performance metrics: duration measures, count measures, proportion of task completed, and quality of output:
Duration metrics measure how much time is spent doing a particular thing, for example, how much time is spent looking at online help screens.
Count measures simply count how many times an event happens, or how many discrete activities are performed, for example, how many errors are made.
Proportion of task completed is not easy to measure; however, it can be achieved by carefully setting the task goals, and then counting how many have been met after a certain time. The final result can be expressed as a numerical percentage of the original task goals.
Quality of output is similarly difficult to measure in absolute terms, although it is usually not difficult to identify good or bad quality output.
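The three readily quantifiable performance metrics above can be computed directly from a log of timestamped events recorded during a test session. The sketch below is illustrative only: the event names, the log format, and the goal counts are assumptions, not part of any particular usability tool.

```python
from datetime import datetime, timedelta

# Hypothetical event log from one recorded session: (timestamp, event).
# The event names here are assumptions chosen for illustration.
log = [
    (datetime(2024, 1, 1, 10, 0, 0), "task_start"),
    (datetime(2024, 1, 1, 10, 1, 30), "help_open"),
    (datetime(2024, 1, 1, 10, 2, 10), "help_close"),
    (datetime(2024, 1, 1, 10, 3, 0), "error"),
    (datetime(2024, 1, 1, 10, 5, 0), "error"),
    (datetime(2024, 1, 1, 10, 9, 0), "task_end"),
]

def duration_between(log, start_event, end_event):
    """Duration measure: total time spent between paired events,
    e.g. time spent looking at online help screens."""
    total = timedelta()
    start = None
    for ts, event in log:
        if event == start_event:
            start = ts
        elif event == end_event and start is not None:
            total += ts - start
            start = None
    return total

def count_events(log, event):
    """Count measure: how many times a discrete event occurred."""
    return sum(1 for _, e in log if e == event)

def proportion_completed(goals_met, goals_set):
    """Proportion of task completed, as a percentage of the goals set."""
    return 100.0 * goals_met / goals_set

help_time = duration_between(log, "help_open", "help_close")
errors = count_events(log, "error")
completion = proportion_completed(goals_met=3, goals_set=4)
```

In this made-up session the user spent 40 seconds on help, made 2 errors, and met 3 of the 4 goals set (75%); in practice the same functions would be run over logs from many users and sessions, and the results compared against the levels laid down in the usability specification.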
Psychometrics can only be applied once an operational prototype has been built towards the end of the design process. Users complete questionnaires designed to ascertain their attitude to the system after using it for a number of hours. These measures include control, helpfulness, affect, learnability, and efficiency. Let's have a look at them individually:
Control refers to the user's feeling that the software is responding in a normal and consistent way to commands and input.
Helpfulness refers to the user's perception that the software communicates in a helpful way and assists in the resolution of operational problems.
Affect refers to whether the user feels good, warm, happy or disgruntled when using the system.
Learnability refers to whether users find the software easy to learn, and the documentation easy to use.
Efficiency refers to the user's feeling that the software is enabling the task(s) to be performed in a quick, effective, and economical manner or is hindering performance.
At least ten users should complete the questionnaire if the results are to be significant.
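Scoring such a questionnaire amounts to grouping Likert-rated items under the five scales and averaging, first per user and then across the user group. The sketch below is a minimal illustration: the item wordings, the 1-5 coding, and the one-item-per-scale grouping are assumptions (validated instruments such as SUMI have fixed, much larger item sets).

```python
from statistics import mean

# Hypothetical questionnaire: each scale from the text is represented
# here by a single illustrative item; real instruments use many items.
SCALES = {
    "control":      ["The software responded consistently to my commands."],
    "helpfulness":  ["The software helped me resolve operational problems."],
    "affect":       ["I felt good while using the system."],
    "learnability": ["The software was easy to learn."],
    "efficiency":   ["I could perform my tasks quickly and effectively."],
}

def score_responses(responses):
    """responses maps each item text to one user's 1-5 Likert rating.
    Returns that user's mean score on each scale."""
    return {
        scale: mean(responses[item] for item in items)
        for scale, items in SCALES.items()
    }

def aggregate(users):
    """Average each scale across all users; the text suggests at least
    ten respondents before treating the result as significant."""
    per_user = [score_responses(r) for r in users]
    return {scale: mean(u[scale] for u in per_user) for scale in SCALES}
```

The per-scale averages can then be compared between systems, or between a system and the attitude levels recorded in the usability specification.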