We talked to 13 UK government ministerial departments, and focus groups revealed a common thread: the sheer volume of born-digital records (What are born-digital records?) produced by UK government departments is exponentially increasing the risk of sensitive data being inadvertently published online.
Our research team led a GOV.UK user Discovery which validated the need for an "AI Assistant" that can help derisk the publishing process. But validating the exact functionality of such an AI tool proved more challenging. How do we get across the exact capabilities of such a system, and its limitations, without going into technical detail?
Here is how no-code prototyping allowed us to effectively communicate the functionality of a fairly sophisticated system, so we could quickly validate and prioritise features before investing time in development.
At a high level, an AI Assistant would highlight potential sensitivities in records transferred to The National Archives (TNA). As is the case with every product, we had a long list of features which we had to prioritise: what exactly do we want the AI Assistant to do, what are the must-haves, and what are the nice-to-haves?
The UK's Public Records Act requires Alex to review records of historical value for sensitivity and preserve them for their ultimate transfer to The National Archives. Alex's team strives to keep pace in a digital-first world where born-digital records are becoming the norm. Twenty years ago, it would have been hard for Alex and her team to foresee that their role today would require them to decide the fate of more than 6 million born-digital records. The volume of digital records created every day has made manual review an impossible task. The same situation applies to practically every major UK central government department.
A conscientious sensitivity review is fundamental to managing risks ranging from the disclosure of confidential information to threats to individuals' safety. AI has the potential to assist in managing these risks, yet government users expressed uncertainty, often mentioning "we would need to see what it's able to do".
Sensitive information can be accidentally disclosed in open records in different ways (Twenty-year rule on public records). In addition to closed records which contain sensitive information, open records may contain Personally Identifiable Information (PII) which has to be redacted before a record can be accessed by the general public. We also found more complex cases where different departments hold the same documents but with conflicting redactions and closure statuses.
Assume Alex's department has to respond to a public inquiry about "Buttermilk Biscuits" which also involves two other departments. Each department holds its own copies and modified versions of the same records. When preparing for transfer, each department also follows a slightly different sensitivity review process. As a result, Buttermilk Biscuits case records would be transferred to The National Archives with different redactions and closure statuses. Manually processing the records and coordinating between departments to manage these risks becomes infeasible at scale.
We proposed a concept for an AI tool that scans records across TNA's digital collection to identify potential PII and similarities between records. So, how does it work? Put simply, users would log into a platform that provides a list of actionable suggestions. Users can then review the suggestions and take measures such as requesting record takedown and reclosure where relevant, preventing the risks associated with widespread access. Building the full web application just to show users 'what it's able to do' would be a wasteful use of resources. Instead, we developed a no-code prototype to test our riskiest assumptions.
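To make the concept concrete, here is a minimal sketch of the kind of suggestion such a tool might surface to a reviewer. It is purely illustrative: the field names, suggestion types and example values are our own assumptions, not part of the prototype or any National Archives system.

```python
from dataclasses import dataclass
from enum import Enum


class SuggestionType(Enum):
    """Kinds of issue the assistant might flag (hypothetical)."""
    POTENTIAL_PII = "potential_pii"
    SIMILAR_RECORDS = "similar_records"


@dataclass
class Suggestion:
    """One actionable suggestion surfaced to a reviewer."""
    record_reference: str        # catalogue reference of the affected record
    record_title: str
    suggestion_type: SuggestionType
    confidence: float            # model confidence between 0.0 and 1.0
    rationale: str               # plain-language explanation shown to the user


# Example of what the platform might present after scanning a transfer
example = Suggestion(
    record_reference="ABC 1/23",
    record_title="Buttermilk Biscuits inquiry correspondence",
    suggestion_type=SuggestionType.SIMILAR_RECORDS,
    confidence=0.87,
    rationale="A near-identical record was transferred by another department "
              "with different redactions applied.",
)
```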
Again, let’s assume Alex recently transferred a collection of born-digital records and wants to double-check that no sensitive information slipped through the net. The front-end user experience might include:
First, users land on the AI Assistant page. We asked: what is an AI Assistant? The majority of users had a fair guess at its meaning; at this point users are familiar with the problem, but not so much with the service's features. After a few prototype iterations, we incorporated a descriptive landing page following the GOV.UK Design System, where users can find a simple, clear and quick description of the service. Once users understand what's achievable through the service, they can proceed to select a suggestion type. For example, some users were aware that another department had recently transferred a record series on a shared subject and wanted to review Similar Records suggestions.
We had to anticipate that, with millions of born-digital records awaiting transfer, the AI could generate a large number of suggestions with varying confidence scores. How do you explain to users what a confidence score is? Is it even relevant? We split the AI suggestions into 'Priority' and 'All Suggestions' collapsible lists, with 'Priority' containing the items with the highest confidence scores. By prioritising suggestions, users were able to focus on actionable items and quickly scan record titles and descriptions to spot suggestions that rang a bell within the context of their department's sensitivities.
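Behind the scenes, the grouping could be as simple as a confidence threshold. The sketch below is an assumption for illustration (the 0.8 cut-off is arbitrary, as are the field names); users only ever see the 'Priority' and 'All Suggestions' lists, never the raw score.

```python
# Hypothetical threshold: suggestions at or above it appear under 'Priority'.
PRIORITY_THRESHOLD = 0.8


def split_by_priority(suggestions: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split suggestions into a high-confidence 'Priority' list and the rest."""
    ordered = sorted(suggestions, key=lambda s: s["confidence"], reverse=True)
    priority = [s for s in ordered if s["confidence"] >= PRIORITY_THRESHOLD]
    remainder = [s for s in ordered if s["confidence"] < PRIORITY_THRESHOLD]
    return priority, remainder


priority, remainder = split_by_priority([
    {"title": "Buttermilk Biscuits inquiry notes", "confidence": 0.91},
    {"title": "Annual catering budget", "confidence": 0.42},
])
```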
The AI can make mistakes. We wanted to enable a two-way human-AI collaboration in which users decide whether a suggestion is actionable and the AI learns from users' input to improve its accuracy. For example, Alex gets a list of suggestions related to a recently transferred series on the subject "pineapple on pizza". In some contexts, the AI may pick up those keywords as sensitive; in the context of Alex's department, however, that specific series is not considered sensitive. In that case, the user can filter by 'date transferred', batch-select the records in the series and dismiss the suggestions, rather than reviewing them one by one.
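To illustrate how dismissing a whole series might work behind the interface, here is a minimal sketch. The field names are hypothetical, and how dismissals would actually feed back into the model to improve accuracy is a design question the prototype deliberately left open.

```python
from datetime import date


def dismiss_batch(suggestions: list[dict], transferred_on: date, reason: str) -> list[dict]:
    """Dismiss every suggestion for records transferred on a given date,
    returning feedback that could in principle be used to tune the model."""
    selected = [s for s in suggestions if s["date_transferred"] == transferred_on]
    return [
        {"reference": s["reference"], "action": "dismissed", "reason": reason}
        for s in selected
    ]


feedback = dismiss_batch(
    [{"reference": "ABC 1/23", "date_transferred": date(2024, 3, 1)}],
    transferred_on=date(2024, 3, 1),
    reason="'Pineapple on pizza' series is not sensitive in this department's context",
)
```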
The user interface is also crafted for speed and clarity, allowing users to dive straight into record review based on information they easily recognise, such as reference, title, date created and description. Alternatively, users can open 'Details' to see the full record information without leaving the page.
Alex has chosen to review a suggestion, found sensitivities in an open record, and wants to put forward a takedown request to TNA. In other cases, Alex may just need to redact the sensitive parts and publish a redacted version of the original record. Either way, users perceived value in quickly screening for potential errors and fixing them before any confidential information is unintentionally disclosed or obtained by malicious parties.
We got users to 'talk AI' by effectively communicating the proposed user experience of a human-AI collaboration platform with no-code prototypes. Users actively engaged in suggesting tools to improve the service's interaction experience without any prior knowledge of, or technical background in, AI.
It became an interesting learning experience to see how users could 'talk AI' from their personal experience, and how we could shape the service interaction with absolutely no code involved.