Stakeholder Feedback

Presentation

The presentation was very well received, with stakeholders noting that the insights validated internal assumptions while clearly exposing deeper usability issues. The findings were described as valuable and reassuring, directly informing plans to redesign and simplify GT’s AI experiences.

Conclusion & Expected Impact

Study

The study found that GT’s AI tools lack discoverability and clarity, leading to confusion, high cognitive load, and a low SUS score. Users struggled to find AI entry points, understand how to generate content, and compare AI-generated edits with original text.


By consolidating AI tools into existing workflows, improving labeling and iconography, and introducing clearer content comparisons, these recommendations are expected to increase first-time task success, reduce friction, and build greater trust and adoption of AI features across the platform.

Insight + Recommendation

Show labeled, side-by-side versions of the original text and the AI-generated text, with color-coded highlighting of what was changed or added.

Participants had difficulty distinguishing the AI-generated content from the original content.


Severity Rating: 4

Usability

FINDING 3:

The Problem:

Evidence:

Recommendation:

Gaze video of one of our participants clicking back and forth between the versions to try and spot the differences.

User quotes from participants given during the Retrospective Think Aloud.

FINDING 2:

The Problem:

Evidence:

CTA for “Generate with AI” is not easily discoverable.


Severity Rating: 3

Recommendation:

Combine the Generate with AI options with the + Add section on the right toolbar.

Insights + Recommendations

Discoverability

FINDING 1:

The Problem:

Evidence:

Recommendations:

Every participant eventually found the “Edit with AI” button, but locating it took a long time, which indicates low discoverability.


Severity Rating: 3

Streamline the editing flow by consolidating the editing tools into one menu and adjusting its position and icon to align with our users' mental models.

System Usability Scale (SUS)

At the end of each session, participants completed the System Usability Scale (SUS), a 10-question survey that provides a quick, reliable measure of a product’s perceived usability. Responses are converted into a single usability score ranging from 0 to 100.
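For readers unfamiliar with how a SUS score is derived, the sketch below applies the standard SUS scoring rule: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5. The function name `sus_score` and the sample responses are hypothetical illustrations, not data or tooling from this study.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    `responses` holds the ten answers in questionnaire order,
    each on the 1-5 agreement scale.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += response - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - response

    # Each item contributes 0-4 points, so the raw sum is 0-40; scale to 0-100.
    return total * 2.5


# Hypothetical participant who answered "neutral" (3) to every item.
example_responses = [3] * 10
print(sus_score(example_responses))
```

Per-participant scores computed this way are typically averaged to give the study-level score.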

Finding

The AI tools suffer from low discoverability and a lack of systemic clarity, negatively impacting the overall experience of the platform.

Overall

Study Tasks

Eye-Tracking

Research

Overview

Study & Goals

Project

This study evaluated the usability and discoverability of GT’s AI tools, focusing on how easily first-time users could find, understand, and complete core tasks. We identified key friction points to improve clarity, efficiency, and user confidence.

Role: UX Researcher & Designer, Product Designer

Duration: September-December 2025

Gutenberg Technology
