The Limits of Rapid Prototyping in AI

Rapid prototyping tests the human consequences of design ideas. It does so quickly.

In essence, it is a design workflow consisting of ideation, prototyping, and testing, where the iterations take place in quick succession.

These prototypes are the central artefact of the process and provide a tangible context to assess the intended and unintended consequences of a proposed solution.

The benefits are best seen in practice:

  • How these prototypes are used indicates whether the solution is suitable and points in the right direction, or whether we’ve missed the mark and need further iterations. Often, they lead to suggestions for better adaptations, or to entirely new design directions.

  • The act of generating new solutions in near real-time materially changes what it means to design and shapes the types of solutions that are experimented with. Quantity has a quality all of its own.


Rapid prototyping has limits. And these limits are tested when designing AI prototypes.

Rapid prototyping breaks down under two conditions:

  1. when prototype generation isn't fast enough, and

  2. when the prototypes are not sufficiently representative of the end solution to test user behaviour.

Unfortunately, AI prototypes tend toward both failure modes.

Speed

The speed of prototype generation and assessment allows for rapid exploration of the solution space.

Smaller investments in potential solutions allow more ideas to be tested. Importantly, reducing the upfront cost of testing ideas leads to less conservative, more outlandish ideas being tested. The search space is broadened.

At the same time, rapid prototypes inoculate against over-investment in unfruitful design directions. It is easy to throw something cheap away; sunk costs are smaller, and so is the tendency to remain unhappily married to ‘suboptimal’ ideas.

Accuracy

Deciding what to test with the AI prototype is important. Rapid prototyping is particularly unsuited to evaluating an idea where the performance of the AI model is a central but uncertain component. In other words, if the product will be useful only when performance exceeds a threshold that we cannot be certain of in advance.
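The problem can be made concrete with a small simulation. In this sketch, all numbers are illustrative assumptions: a hypothetical usefulness threshold of 0.90 and a pre-PoC belief about model accuracy expressed as a normal distribution. The point is only that, under uncertainty about accuracy, the product's viability is itself a probability, not a fact a rapid prototype can settle.

```python
import random

random.seed(0)

THRESHOLD = 0.90                    # assumed accuracy below which the product isn't useful
PRIOR_MEAN, PRIOR_SD = 0.88, 0.04   # hypothetical pre-PoC belief about achievable accuracy

# Monte Carlo over our uncertainty: how often would the model clear the bar?
draws = [random.gauss(PRIOR_MEAN, PRIOR_SD) for _ in range(10_000)]
p_viable = sum(a >= THRESHOLD for a in draws) / len(draws)
print(f"P(accuracy >= {THRESHOLD}): {p_viable:.2f}")
```

A happy-path prototype implicitly assumes `p_viable` is 1; the technical PoC exists to replace the assumed prior with a measurement.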

Additionally, if judicious error-handling is fundamental to the product, then a prototype that doesn't address a representative range of errors will be invalid. This is particularly the case if certain long-tail errors are catastrophic, corrupting the user experience and destroying trust.

Users may love the happy path of a prototype's omniscient AI, but if that is not what will ultimately be delivered, the prototype is little more than a toy.

— — —

What to do about it

Deciding what to test in AI prototypes is central to a well-defined AI process. Splitting the work between a technical proof-of-concept that assesses performance and a user-focused prototype is a straightforward way to remove uncertainties around accuracy. Of course, this comes with a commitment to invest in the technical PoC prior to understanding user consequences.

To constrain the cost and time of the technical PoC, limiting it to a sub-section of the data may be useful. This PoC can then form the backend to a series of front-end rapid prototypes. Examples where this can work well include building for individual verticals of e-commerce sites, or a subset of use-cases in voice interactions.
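As a minimal sketch of that scoping step, the example below restricts a hypothetical e-commerce catalogue to a single vertical. The records and field names are invented for illustration; the idea is simply that the PoC model trains and serves on a small, cheap slice, which can then sit behind a series of front-end rapid prototypes.

```python
# Hypothetical catalogue rows; only the "vertical" field matters for scoping the PoC
catalogue = [
    {"sku": "A1", "vertical": "shoes", "title": "Trail runner"},
    {"sku": "B2", "vertical": "electronics", "title": "Headphones"},
    {"sku": "A2", "vertical": "shoes", "title": "Leather boot"},
]

def poc_slice(rows, vertical):
    """Keep only one vertical so the technical PoC stays small and cheap."""
    return [r for r in rows if r["vertical"] == vertical]

shoes_only = poc_slice(catalogue, "shoes")
print(f"{len(shoes_only)} of {len(catalogue)} items in PoC scope")
```

The same scoping logic applies to voice interactions: filter the interaction logs to one subset of use-cases before building the PoC backend.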

Wizard of Oz prototypes are a favoured method of mocking up AI systems. To assess the impact of unexpected AI behaviours or algorithmic mistakes, purposefully including errors in the prototype is a good starting point. Listing out possible mistakes that the algorithm could make allows us to examine the types of reactions these errors evoke in users. Of course, this will never be exhaustive and does not protect from unknown unknowns, but it is a substantial improvement over the over-optimistic AI magic that is typically tested.
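The error-injection idea can be sketched as code. This is a hypothetical harness, not a real tool: the error taxonomy and the response shape are assumptions standing in for whatever mistake-listing exercise the team has done. The wizard's scripted reply deliberately fails a configurable fraction of the time, so sessions surface user reactions to errors rather than only to the happy path.

```python
import random
from typing import Optional

# Assumed error taxonomy from a mistake-listing exercise (illustrative names)
ERROR_MODES = ["wrong_category", "stale_item", "no_result"]

def wizard_response(query: str, scripted_answer: str,
                    error_rate: float = 0.2,
                    rng: Optional[random.Random] = None) -> dict:
    """Return the wizard's scripted 'AI' reply, deliberately injecting
    an error with probability `error_rate`."""
    rng = rng or random.Random()
    if rng.random() < error_rate:
        return {"query": query, "answer": None, "error": rng.choice(ERROR_MODES)}
    return {"query": query, "answer": scripted_answer, "error": None}

# Force a failure to preview the error payload a participant would see
forced = wizard_response("running shoes", "Trail runner",
                         error_rate=1.0, rng=random.Random(0))
print(forced)
```

Setting `error_rate` per session lets the team probe how much failure users tolerate before trust collapses; long-tail catastrophic errors can be scripted explicitly rather than left to chance.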

— — —

Embedding UX processes into the design of AI systems remains one of the central challenges of building human-centred AI products. There is an abundance of opportunity to build new tools that bridge the divide between designers, data scientists, and the products we put in the hands of our users.
