We're flooding organizations with AI tools while measuring the wrong outcomes.
Two studies reveal the same finding. Researchers tracked designers and tested students. In both groups, our metrics capture activity while missing transformation. We count tool usage, clicks, and completion times. We measure knowledge acquisition and feature adoption. None of these predict whether humans get better at thinking.
Designers viewing AI suggestions without selecting them outperform those who never see them. Students memorizing AI terminology underperform those building collaborative judgment. In both cases, the visible metrics (selections made, concepts learned) tell us nothing about whether people can solve harder problems.
This isn't a measurement refinement problem. It's a category error. We're using industrial-age metrics to evaluate intelligence-age tools that change how people think. We're not measuring thinking at all.
From Metrics to Meaning: Why We're Measuring Human-AI Collaboration Wrong
We've been counting clicks when we should have been measuring minds.
The Discovery
A field study of 808 designers revealed something unexpected: users who simply viewed AI-generated design galleries, without selecting anything, spent 22.5 minutes per session on design tasks. Users who never saw the galleries spent 9.4 minutes. Same task. Same goal. One group invested 140% more time and produced better outcomes.
Why does viewing matter more than clicking?
Because we've been measuring wrong. Current research counts actions: clicks, edits, selections, rejections. These metrics assume value only exists where we can count interactions. A designer studying an AI suggestion might never select it yet still absorb its core principle, fundamentally altering their creative direction without leaving a trace.
How AI Really Changes Work
Designers using AI galleries spent longer on tasks and produced significantly better work. When researchers controlled for time in a lab study, quality improvements disappeared. Time was the mechanism. AI doesn't accelerate completion. It increases engagement, which increases investment, which improves outcomes.
Participants described galleries as "making me think harder" and "showing possibilities I'd want to explore." Exposure to diverse, high-quality examples raised their aspirations and willingness to invest effort.
AI functions as a creative catalyst.
Why Previous Research Contradicted Itself
Engagement operates on three independent axes: cognitive, emotional, and behavioral. Studies measuring only one axis produce contradictory results.
The lab study showed this dramatically. One participant spent significant time viewing an intelligent algorithm's gallery and made numerous selections from it, yet reported preferring the random control gallery. Another made zero selections and described the experience as "changing how I thought about the problem."
Humans process AI suggestions through mental models that don't map onto interface interactions. A single viewed suggestion can spark reasoning chains that influence dozens of subsequent decisions, none traceable to the original stimulus.
What This Means
Algorithms are core UX components that reshape how designers think, not backend optimizers.
Current evaluation (clicks, time-on-task, completion rates) captures behavioral residue, not cognitive transformation. Researchers need attention patterns, strategy shifts, and mental model evolution throughout the creative process.
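To make the gap concrete, here is a minimal sketch, not the study's actual instrumentation, of what click-counting misses. The event names and fields below are hypothetical; the point is that a session with zero selections can still show heavy engagement.

```python
# Minimal sketch (hypothetical event schema, not the study's pipeline):
# contrast "behavioral residue" (selections) with engagement (dwell time).
from dataclasses import dataclass

@dataclass
class Event:
    session_id: str
    kind: str                # "gallery_view" or "selection" -- illustrative labels
    dwell_seconds: float = 0.0

def summarize(events: list[Event]) -> dict:
    """Summarize one session two ways: what was clicked vs. how long it was studied."""
    selections = sum(1 for e in events if e.kind == "selection")
    viewing_minutes = sum(e.dwell_seconds for e in events if e.kind == "gallery_view") / 60
    return {"selections": selections, "viewing_minutes": round(viewing_minutes, 1)}

# A designer who studies the gallery but selects nothing:
session = [
    Event("d-042", "gallery_view", dwell_seconds=310),
    Event("d-042", "gallery_view", dwell_seconds=540),
]
print(summarize(session))  # {'selections': 0, 'viewing_minutes': 14.2}
```

A click-only dashboard scores this session as zero activity; a dwell-time view shows fourteen minutes of study that may have changed the designer's entire direction.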
The Real Question
We've been asking: "How often do users interact with AI suggestions?"
We should ask: "How does AI presence change how users think?"
The answer determines whether AI becomes a tool that replaces human judgment or amplifies human creativity. This study demonstrates the latter, if we measure what matters.
Source: https://dl.acm.org/doi/pdf/10.1145/3773292
The AI Education Illusion
Students who know more about AI think less effectively with it
Chinese researchers tested 1,466 university students and discovered a critical finding for AI training design: teaching people about AI doesn't improve their ability to think with it. Factual knowledge actually interferes with problem-solving ability.
The study measured two things: how much students knew about AI concepts, and how well they could solve problems computationally. Students who scored high on factual knowledge showed weaker problem-solving ability once researchers controlled for other factors. The knowledge itself contributed nothing. It seemed to crowd out what actually mattered.
What separates competent users from knowledgeable ones
Two capabilities determined success: abstract reasoning about AI systems, and the ability to collaborate with them on real tasks. Students strong in these areas excelled at computational challenges. Students who'd memorized AI terminology struggled.
The gap appears in how students spend their time. Using ChatGPT for four hours doesn't improve thinking. What students do during those hours does. The researchers found that tool usage improved thinking only when it first built literacy, which then enabled problem-solving. Break that chain anywhere and the hours produce nothing.
The mediation effect that changes everything
Consider the numbers: 78% of the benefit from using AI tools flowed through literacy development. The direct path from tools to thinking contributed almost nothing. Universities buying site licenses and declaring victory on AI integration are measuring the wrong outcome.
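For readers who haven't run a mediation analysis: the "78% through literacy" claim comes from splitting the total effect of tool use into a direct path and an indirect path that runs through literacy. A minimal sketch of that arithmetic, with illustrative coefficients rather than the study's estimates:

```python
# Product-of-coefficients mediation sketch (illustrative numbers, not the study's).
a = 0.60        # path a: tool use -> AI literacy
b = 0.65        # path b: AI literacy -> problem-solving, holding tool use constant
c_prime = 0.11  # path c': tool use -> problem-solving, direct effect

indirect = a * b                # effect that flows through literacy
total = c_prime + indirect      # total effect of tool use on problem-solving
proportion_mediated = indirect / total

print(f"indirect={indirect:.2f}  total={total:.2f}  mediated={proportion_mediated:.0%}")
# indirect=0.39  total=0.50  mediated=78%
```

If the literacy-to-problem-solving path is broken, nothing learned during tool time reaches thinking, which is exactly the "break that chain anywhere" point above.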
Engineering students excelled at algorithms. Liberal arts students collaborated better with AI. Neither group showed dramatic advantages overall. Urban students outperformed rural ones by margins too small to matter practically. The disciplinary and demographic gaps educators worry about explain less than 1% of variance in outcomes.
How to fix the curriculum
The solution requires a different curriculum structure. Every AI interaction needs three phases: learn the concept, apply it to an authentic problem, then defend your choices publicly. The researchers recommend Demo Days where students explain their prompt engineering decisions, or team structures that assign explicit roles. One person writes prompts, another validates outputs, a third reviews ethics.
The architecture matters because most AI education reverses the sequence. Institutions teach concepts, then hope application follows naturally. It doesn't. Students accumulate terminology while their problem-solving capacity stagnates.
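As a sketch of what enforcing that sequence could look like in practice (the phase and role names come from the paragraph above; everything else is an assumption, not the researchers' protocol):

```python
# Minimal sketch: the learn -> apply -> defend sequence as an ordered checklist.
PHASES = ["learn the concept", "apply it to an authentic problem", "defend the choices publicly"]
ROLES = {"prompt_writer": "writes prompts",
         "output_validator": "validates outputs",
         "ethics_reviewer": "reviews ethics"}

class AISession:
    """Track one AI interaction; phases must be completed in order."""
    def __init__(self, team: dict[str, str]):
        self.team = team              # member name -> role key from ROLES
        self.completed: list[str] = []

    def complete(self, phase: str) -> None:
        expected = PHASES[len(self.completed)]
        if phase != expected:
            raise ValueError(f"out of order: expected '{expected}', got '{phase}'")
        self.completed.append(phase)

session = AISession({"Ana": "prompt_writer", "Ben": "output_validator", "Chen": "ethics_reviewer"})
session.complete("learn the concept")
session.complete("apply it to an authentic problem")
session.complete("defend the choices publicly")
```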
Why this matters beyond campus
The finding carries weight beyond education. If knowledge transfer fails even in controlled university settings, organizations deploying AI training for employees face the same problem: hours invested, competencies unchanged, metrics reported as success.
Universities face a choice. They can integrate AI everywhere and produce students who can define large language models without using them effectively. Or they can build literacy scaffolds that turn tool time into thinking time. The difference shows up not in usage statistics but in what students can actually do when assessment arrives.
The research suggests that most of today's AI training programs are choosing the first path without realizing it. They're teaching vocabulary when they should be building judgment.
Source: https://www.nature.com/articles/s41598-025-26888-z
The Literacy We're Not Building
Both studies reveal why AI training programs fail.
Transformation happens between stimulus and action. We measure only the endpoints. Employees complete courses but can't apply tools effectively. Training dashboards count completions while missing whether people absorbed principles that changed their approach.
AI literacy requires instrumenting cognition. Track attention patterns and strategy shifts, then assess what employees can do, not what they clicked. Standard learning management systems treat training as content delivery, not cognitive intervention.
What works: structured reflection after each interaction, collaboration protocols that assign explicit roles, defense of reasoning chains. This requires more than video modules and quiz completion.
Most training programs are building vocabulary when they should be building judgment.
The question isn't whether AI transforms how employees think. The question is whether training programs will deliver the transformation or just the transaction.
Until next time, Matthias
