Measuring and Quantifying User Experience

12th October, 2022

Updated: 12th October, 2022

    UX designers struggle to prove the value of their work when arguments like “user experience is hard to measure” or “how can you prove that it’s better?” start flying around. Let’s take a look at three methods that can help us prove that our work does make a difference.

    Photo by Taras Shypka on Unsplash

    “Data-driven” is all the rage at the moment, and everyone wants a slice of the “big data” cake. Data scientists are the new rock stars, replacing the JavaScript and front-end gurus and ninjas from a few years back. My problem with trends like these is that they cause so-called “tunnel vision”. Thing x is a trend right now and we should do that too because, you know, everyone’s doing it.

    It seems that, at the moment, companies take pride in how “data-driven” they are. It must be a good thing, because big, fancy companies claim to be so. All these companies risk ending up in a tunnel where quantitative tests claim to provide the answers to all questions. But there’s a reason why there’s a qualitative way of acquiring feedback. Quantitative tests are good at providing information about the ‘what’, but they usually can’t give us an answer to the ‘why’. This is exactly where the qualitative side comes in.

    Good design work is informed by both the quantitative and the qualitative way of answering questions.

    Balancing the ‘qual’ and the ‘quant’

    Some companies have already started to realise that data alone can’t answer all the questions they need answered. Some have put “Informed by data, driven by empathy” in their design guidelines. We’re heading in that direction at Auto Trader, where we changed “data-driven” to “data-oriented”. I would prefer “data-informed”, but it’s a step in the right direction.

    Gathering qualitative feedback is one of the strongest weapons in a UX designer’s arsenal. But designers often struggle to put that information forward, because it gets dismissed with “that’s not a large (enough) sample you have there”. Qualitative tests get trumped by the quantitative ones. The user feedback gets ignored and the product gets delivered based on hunches, “backed” by results of quantitative (usually A/B) tests. See what I mean by tunnel vision? Companies following this process end up in a never-ending chase for a silver bullet. They’re always after a piece of data that will show them the next big feature to build. “Feature x will be a huge hit, trust us”, they claim comfortably. How often do they actually deliver that kickass feature? I have worked for a few companies like that and never saw it come true. All these companies get caught in a repeating process that, at best, reaches a local maximum.

    Local maxima by Joshua Porter.

    Reliance on the two testing methods needs to be split 50/50: qualitative testing needs to carry the same weight as quantitative testing. Together they provide insightful information that we can confidently act on, either by conducting further tests or by designing and developing another iteration of the product.

    Quantifying user experience

    As UX designers, we need to challenge the sole reliance on data-backed hunches. UX research must be at the core of the business and with it the qualitative way of acquiring feedback. It might be hard to get started in a company that is “data-driven” and the best way to challenge it is to quantify the UX research data. I believe that almost any type of qualitative feedback can be quantified. So far I’ve used three ways of quantifying the qualitative feedback.

    1. Quantifying user testing feedback

    • Time & effort required: medium to high.
    • Great for: evaluating whole user experiences and identifying pain points and usability problems.
    • Recommended number of participants: depends on the budget and timing. It can be anywhere from 5 to 100. My recommendation is more iterations with smaller groups (5–10).
    • Key metrics: perceived difficulty rating, time required to complete the task, errors committed and task success rate.

    I would recommend taking this approach when you’re working on a new product or feature that’s in the later stages of product development. This approach is great for testing an overall user experience and identifying potential usability problems. Does the flow of actions required to perform a task make sense? Are the users getting stuck at certain points? Are they getting distracted? Is it unclear what needs to be done next?

    The key to quantifying this feedback is asking about perceived difficulty for each step/task: How would you rate the difficulty of this step/task from 1 to 7, 1 being easy, 7 being very hard? Here’s a step-by-step guide to quantifying user testing feedback.

    Devise a test

    This works well with both moderated and unmoderated user testing. Decide which tasks you want the participants to complete. These can range from simple ones like “Upload images to a web based gallery” and “Save the current post” to more complex ones like “Find a silver floor lamp that you like and purchase it”. The granularity of the breakdown of tasks depends on what you’re trying to find out and on your preferences.

    The test shouldn’t be longer than 15 simple or 5 complex tasks. Don’t forget to do a test run to identify things you might have missed. Go through the tasks and perform them yourself. Don’t skip this, it’s important.

    An example of 15 tasks to be tested.

    Come up with a way to quantify the results (1–7 questions etc.)

    Asking about difficulty is the simplest way to quantify user testing feedback. Measuring the time required to finish a task is also quite common. I have rarely used it because I like user testing sessions where participants freely express their thoughts. Measuring task time doesn’t work well in this case because the participants start to talk in the middle of the task, and stopping them would be foolish and maybe even rude. Success rate and error rate are interesting metrics to pay attention to. They complement the perceived difficulty question well.

    Prepare a way to analyse the data

    Data is meaningless if you can’t analyse it. The easiest way to analyse user testing feedback is to prepare a spreadsheet with the list of tasks you decided to test. Next to the tasks, add a separate column for each metric that you want to track: perceived difficulty, time required, error/success rate. This makes it easy to put in the results during the actual user testing session.
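
    A sheet like this can also be generated rather than typed by hand. The following is only a minimal sketch: the task list, the column names and the build_template helper are hypothetical, and any spreadsheet tool that opens CSV files should be able to read the result.

```python
import csv
import io

# Hypothetical tasks with their importance level.
tasks = [
    ("Upload images to a web based gallery", "high"),
    ("Save the current post", "medium"),
]

# Assumed column names: one column per metric mentioned in the text.
columns = ["task", "importance", "success", "difficulty_1_to_7", "time_seconds", "errors"]


def build_template(task_list):
    """Return CSV text with one row per task and empty metric cells."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for name, importance in task_list:
        writer.writerow([name, importance, "", "", "", ""])
    return buffer.getvalue()


template = build_template(tasks)
print(template)
```

    The empty cells are what the note taker fills in during each session.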

    The table in the example above isn’t the sexiest but it’s efficient. On the left, it lists all the tasks that a participant needs to go through. Each task has been assigned a level of importance: medium or high. After that, there’s a column where the note taker records whether the user completed the task successfully straight away (direct success, 3 points), failed at first but got around it in later attempts (indirect success, 1 point), or failed completely (0 points). Then there’s the question about perceived difficulty: a rating of 1 means easy, so the maximum (7) points get assigned.

    Overall score = Success score + Difficulty score

    The maximum possible score in the example above is 10. If the score is 6 or lower, the cell is coloured yellow; if it is 2 or lower, it’s coloured red because it’s a high priority to resolve.
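
    The scoring rules above can be sketched in a few lines of Python. This is an illustration rather than the spreadsheet we used: the function names are made up, and the text only states that a rating of 1 earns 7 points, so the linear inversion (8 minus the rating) for the other ratings is an assumption.

```python
# Points per outcome, as described in the text.
SUCCESS_POINTS = {"direct": 3, "indirect": 1, "fail": 0}


def difficulty_points(rating: int) -> int:
    """Convert a 1-7 perceived difficulty rating (1 = easy) into points.

    Assumption: a linear inversion, so a rating of 1 earns 7 points
    and a rating of 7 earns 1 point.
    """
    if not 1 <= rating <= 7:
        raise ValueError("rating must be between 1 and 7")
    return 8 - rating


def overall_score(outcome: str, rating: int) -> int:
    """Overall score = success score + difficulty score (maximum 10)."""
    return SUCCESS_POINTS[outcome] + difficulty_points(rating)


def cell_colour(score: float) -> str:
    """Colour thresholds from the text: 2 or lower is red, 6 or lower is yellow."""
    if score <= 2:
        return "red"
    if score <= 6:
        return "yellow"
    return "none"


for outcome, rating in [("direct", 1), ("indirect", 4), ("fail", 7)]:
    score = overall_score(outcome, rating)
    print(outcome, rating, score, cell_colour(score))
```

    Under these assumptions, a direct success rated 1 scores the full 10 points, while a complete failure rated 7 scores 1 and lands in the red band.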

    Run the tests

    With the list of tasks and a way to collect and quantify the information, you’re now ready to go. Don’t forget to write down the numbers as you go. Ideally, you should run these tests with a partner, so that one person moderates and the other keeps notes.

    Analyse the data

    This step requires more time if you ran unmoderated tests in the previous steps ( comes to mind). If that’s the case, you now need to watch all the videos and put the numbers into the document where you keep track of the metrics.

    If you ran moderated tests and had a partner who kept track of the numbers during the test, some number crunching is all that’s left. Calculate the average for each user and the average for each task. Do that for each metric.
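
    For anyone who prefers a script to spreadsheet formulas, the per-user and per-task averages could be computed roughly like this. The participants, tasks and scores below are invented for illustration, and averages_by is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical results: (participant, task, overall score out of 10).
results = [
    ("P1", "Upload images", 10),
    ("P1", "Save the post", 5),
    ("P2", "Upload images", 8),
    ("P2", "Save the post", 3),
]


def averages_by(rows, key_index):
    """Group scores by participant (index 0) or task (index 1) and average them."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[2])
    return {key: mean(scores) for key, scores in groups.items()}


per_participant = averages_by(results, 0)
per_task = averages_by(results, 1)
print(per_participant)
print(per_task)
```

    The same function can be reused for every metric by swapping the score column for difficulty, time or errors.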

    Present the results & act on them

    Now comes the fun part. It’s time to present those numbers in a way other people in your company can understand. Create charts for the metrics that you track. Charting each metric per task makes the most sense here.

    In the example above, I calculated the averages of success, difficulty and overall scores for each task. I separated those by sessions as we did small iterations between them. There were around 10 participants in each session.

    On the far right we have the all-time scores: averages calculated from all the sessions. The average task score was 8.18 and the low average was 6.06.

    I then went and put the results in a visual form so they were easier to interpret. In the left chart we could see that only a few tasks had very poor average scores (higher is better). Interestingly, those poor scores correlated with the success scores (illustrated by the chart on the right). Based on these results we decided to go back to the drawing board to try and fix the issues we identified.

    When we redesigned the most problematic parts and retested the app, the average score was 9.65 and the low average was 7.5. The redesigned app was so much better that participants no longer failed to complete any of the tasks. This approach takes a bit more time and effort but it’s worth it because the results are usually easy to act on.

    2. Quantifying feedback for quick UX/UI design decisions

    • Time & effort required: low
    • Great for: Optimising user experiences and interactions by getting the details right.
    • Recommended number of participants: 25–100.
    • Requires a tool like Usability Hub or UserZoom.

    This approach works best when working on a particular part of a user interface. It’s a great way to answer questions like:

    • Is it clear enough what this button does?
    • Do users understand what this part is and that they can interact with it?
    • Does introducing a particular element or feature affect user behaviour?

    These tests can answer such questions within an hour or two. Platforms like Usability Hub and UserZoom are great for performing these kinds of tests. Click tests (where users are asked a question and answer by clicking on a design) and question tests (where users are shown a design and asked questions, either open-ended or closed-ended) work best for this kind of testing. Here’s an overview of how you can do it.

    Tip: Usability Hub allows you to recruit your own users by giving you a shareable URL to your test so you can get feedback for free.

    Devise the test so that the results are quantifiable

    The questions are the most important part of this test. They need to be shaped so the results are easy to quantify. For example: “What do you think will happen after you click on the red button?” Follow it with a closed list of options: option 1, option 2, option x, and a final option of “I don’t know”.

    Again, don’t forget to do a test run of the test. You’ll waste resources if you get it wrong.

    Run the test

    Once your test is designed, you’re ready to go. Choose the number of participants you want to take the test and launch it. The time to get results depends on the number of participants but it shouldn’t take longer than an hour (if you use Usability Hub’s recruitment pool).

    Analyse the data

    Go ahead and do some number crunching once you get the results back. Exporting results to a spreadsheet helps with that.
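
    The number crunching for a closed question usually boils down to the share of participants who picked each option. Here is a small sketch of that calculation; the answers are invented, and share_per_option is a hypothetical helper rather than part of any testing platform.

```python
from collections import Counter

# Hypothetical answers to "What do you think will happen after you click on the red button?"
answers = ["option 1", "option 1", "option 2", "I don't know", "option 1"]


def share_per_option(responses):
    """Return the percentage of participants who chose each option."""
    counts = Counter(responses)
    total = len(responses)
    return {option: round(100 * n / total, 1) for option, n in counts.items()}


shares = share_per_option(answers)
print(shares)
```

    Running the same function on the control and on the new design gives two sets of percentages that can be compared side by side.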

    Present results & act on them

    Put the results into a form that can be presented to others. It takes a bit more effort to do this but it’s worth it. Presenting the results in a clear way is so much easier than showing raw data. People will believe the numbers you put in front of them; they don’t care about the raw data.

    These were two separate “Question tests” on Usability Hub where users were shown a design and asked two questions, one open-ended and one closed-ended. One of the two designs was the control version and the other was the newly proposed design. With this test we were able to prove that users understand a key piece of UI in the new design.

    So far, I have found that a sample of around 25 to 50 participants is required to convince other stakeholders that the results are credible and should be acted upon.

    3. Run the “experience rating” poll with a quantitative test

    • Time & effort required: low/medium
    • Great for: Quantifying whole user experiences on a large scale.
    • Requires a tool like HotJar or a similar polling tool.

    The third way is great to run in tandem with a quantitative (A/B) test. Let’s say we completely redesigned an experience in our product. The team decides to run an A/B test to compare the key metrics of the new design against the control. The quantitative test alone will tell them how the metrics move, but in some cases a metric can be ambiguous. Page views, for example, can be a misleading metric. One could interpret a drop in two ways:

    • fewer page views means that the users are finding what they’re looking for quicker, so the experience is better, or
    • fewer page views means that the users are finding the experience worse, so they’re leaving the page sooner.

    So why not ask them? Create a poll with a question like “How would you rate your experience today?” and let them answer on a scale, usually from 1 to 5.

    Here’s an example of how we did that at Auto Trader. We couldn’t use Hotjar in our example so we custom-built the poll and tracked the results with Google Analytics.

    In the case of the page views example from before, the change in that metric was now easier to interpret.
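
    To illustrate how the poll results can sit next to the A/B metrics, here is a rough sketch that summarises the ratings per variant. The ratings are invented, the variant names and the summarise helper are hypothetical, and treating answers of 4 or 5 as “positive” is an assumption, not a rule from our test.

```python
from statistics import mean

# Hypothetical poll answers (1-5) collected from each A/B test variant.
ratings = {
    "control": [3, 4, 2, 3, 4],
    "new_design": [4, 5, 4, 3, 5],
}


def summarise(scores):
    """Return the mean rating and the share of 4-5 answers for one variant."""
    positive = sum(1 for s in scores if s >= 4)
    return {"mean": mean(scores), "positive_share": positive / len(scores)}


summary = {variant: summarise(scores) for variant, scores in ratings.items()}
for variant, stats in summary.items():
    print(variant, stats)
```

    If page views drop in the new design while its experience rating goes up, the “users find things quicker” reading becomes the more plausible one.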

    Why bother

    All this may seem like a lot of work, but it isn’t that much, and you get to find out whether you’re designing better user experiences or not. Once done, you’re armed with the two things people believe the most: data that is easy to interpret and a visual presentation of it. The results will be based on testing with a significant number of participants, so they’re harder to dispute. So far, this is the best way I’ve found to put qualitative testing alongside the quantitative. What we get in the end is insightful information we can act upon, unlike, you know, data alone: data that we often don’t really know how to interpret.



