research, software · 2025

Assessing Personality Creatively

Can minigames and open-ended questions replace personality questionnaires? A master's thesis exploring gamified personality assessment through behavioural data and LLM analysis.

Context: Leiden University · Master's Thesis
Year: 2025
Type: research, software

Overview

Personality assessment has relied on questionnaires for decades. They work reasonably well, but they have real problems: people answer based on how they see themselves rather than how they actually behave, long surveys lead to fatigue and careless answers, and the format itself can push people toward socially acceptable responses.

This thesis asks a simple question: what if we assessed personality through what people do instead of what they say about themselves?

The answer took the form of three minigames and three open-ended questions, all targeting the Big Five personality trait of conscientiousness. Fifty-one participants completed the full experiment, which combined the custom-built tool with a traditional personality assessment (the IPIP-NEO-120) for comparison. Behavioural data was collected in the background during the minigames, and the open-ended answers were analysed with Claude, a large language model (LLM).

The short answer: the tool is not yet ready to replace traditional assessments. But the correlations found along the way are interesting and could inform the further development of serious games for personality assessment.

The Experiment

The experiment had three parts. First, participants answered three open-ended questions framed as everyday scenarios, such as what they would do if money disappeared from their bank account, or whether they would still go for a walk if it might rain. The scenarios were designed to draw out behaviours tied to specific facets of conscientiousness without participants realising they were being assessed for it.

Next, they played three minigames, each designed around a different facet of conscientiousness. The minigames collected behavioural data in the background: click timestamps, hover durations, paths taken, books sorted, buttons pressed. None of this was visible to the participant.

Finally, they completed a standard 32-item conscientiousness questionnaire. This was the baseline everything else was compared against.

The open-ended answers were run through Claude, which scored each response on content, writing style, and terminology. The minigame data was processed with Python and Excel to extract behavioural variables. The resulting scores and variables were then correlated with the results of the traditional assessment.
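The correlation step can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code from the thesis: it assumes a Pearson correlation (the coefficient actually used is not stated here), and the variable names and numbers are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, one entry per participant (NOT data from the study).
books_sorted = [3, 5, 8, 10, 12]               # behavioural variable from a minigame
self_discipline = [2.1, 2.8, 3.0, 3.9, 4.2]    # questionnaire facet score

r = pearson_r(books_sorted, self_discipline)
print(f"r = {r:.2f}")
```

With 51 participants and many behavioural variables, a real analysis would also need significance tests and some correction for multiple comparisons, which this sketch leaves out.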

The Minigames

Sorting Books
Participants were given a shelf and a pile of books, and asked to place at least a minimum number on the shelf. They could sort them however they liked, or not at all. The game tracked which books they picked up, in what order, how many times they rearranged them, and the final order they landed on.
The Maze
Five mazes to complete, each larger than the last. The twist: mazes three and five were impossible to solve. The game tracked how long participants kept trying before giving up, how many key presses they made in the impossible mazes, and how long they hovered over the skip button before clicking it.
Spin the Wheel
A chance wheel presenting five scenarios of increasing risk. Participants could spin and accept the outcome, or skip. Unknown to them, the outcome was predetermined. The game tracked hovering patterns on both the spin and skip buttons, measuring hesitation and decision-making speed.
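Variables such as key presses, hover time, and time before skipping can be derived from a timestamped event log. The sketch below assumes a hypothetical log format (a list of dictionaries with `t`, `type`, and `target` fields, timestamps in milliseconds). The event names and helper functions are illustrative and are not taken from the thesis tool.

```python
def hover_duration(events, target):
    """Total milliseconds the pointer spent over `target`.

    Pairs each 'hover_start' with the next 'hover_end' on the same target.
    """
    total = 0
    started_at = None
    for event in events:
        if event.get("target") != target:
            continue
        if event["type"] == "hover_start":
            started_at = event["t"]
        elif event["type"] == "hover_end" and started_at is not None:
            total += event["t"] - started_at
            started_at = None
    return total


def count_events(events, event_type):
    """Number of events of a given type, e.g. key presses."""
    return sum(1 for event in events if event["type"] == event_type)


def time_until(events, event_type, target):
    """Milliseconds from the first logged event to the first match, or None."""
    for event in events:
        if event["type"] == event_type and event.get("target") == target:
            return event["t"] - events[0]["t"]
    return None


# Hypothetical log for one impossible maze (NOT data from the study).
maze_log = [
    {"t": 0, "type": "level_start"},
    {"t": 400, "type": "keypress"},
    {"t": 900, "type": "keypress"},
    {"t": 5000, "type": "hover_start", "target": "skip"},
    {"t": 5600, "type": "hover_end", "target": "skip"},
    {"t": 7000, "type": "keypress"},
    {"t": 9000, "type": "hover_start", "target": "skip"},
    {"t": 9800, "type": "click", "target": "skip"},
    {"t": 9800, "type": "hover_end", "target": "skip"},
]

variables = {
    "key_presses": count_events(maze_log, "keypress"),
    "skip_hover_ms": hover_duration(maze_log, "skip"),
    "time_to_skip_ms": time_until(maze_log, "click", "skip"),
}
print(variables)
```

The same helpers would apply to the wheel minigame by passing a different target, for example a hypothetical `"spin"` button.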

Some Fun Findings

How you sort books predicts how disciplined you are
Participants who sorted fewer books onto the shelf tended to score lower on self-discipline. The effect was strongest in the first level. Apparently, how much effort you put into an optional organising task says something real about your conscientiousness.
Height is the most natural way to sort
When sorting books, only height-based ordering correlated with conscientiousness. Sorting by colour or alphabetically showed no correlation at all. It seems that when people think about organising, height is the default logic they reach for.
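A finding like this depends on turning each participant's final shelf into a number. One possible way to do that, shown here as a sketch and not necessarily the coding used in the thesis, is to score how close the final order is to a height-sorted one by taking the share of adjacent book pairs that are in order. The function names and book heights are hypothetical.

```python
def order_score(values):
    """Share of adjacent pairs in non-decreasing order (0.0 to 1.0)."""
    if len(values) < 2:
        return 1.0
    in_order = sum(1 for a, b in zip(values, values[1:]) if a <= b)
    return in_order / (len(values) - 1)


def height_sortedness(heights):
    """Score a shelf by height, accepting ascending or descending order."""
    return max(order_score(heights), order_score(heights[::-1]))


# Hypothetical book heights in centimetres, left to right on the shelf.
tidy_shelf = [18, 20, 22, 25, 30]
mixed_shelf = [30, 18, 25, 20, 22]

print(height_sortedness(tidy_shelf))
print(height_sortedness(mixed_shelf))
```

The same scoring could be applied to colour or title by replacing the heights with a numeric colour value or an alphabetical rank, which would allow the three sorting strategies to be compared on equal terms.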
The impossible maze reveals perseverance
The third and fifth mazes were designed to be unsolvable. More conscientious participants pressed fewer keys in these impossible mazes, particularly the last one. Rather than frantically trying every direction, they seemed to assess the situation more carefully before deciding to move on.
Simpler writing correlates with higher conscientiousness
This one was unexpected. It was assumed that more conscientious people would write in a more structured, stylistic way. The data said the opposite: participants with higher conscientiousness tended to write more simply and directly. The interpretation: conscientious people get to the point.
Hovering over a button tells you something
Participants who hovered less over the spin button on the wheel minigame tended to score higher on conscientiousness overall. On the last wheel scenario (walking past a seemingly drunk stranger late at night), those who hovered less over the spin button had higher self-efficacy. Less hesitation, more confidence in their own judgement.
Participants preferred the minigames, but were more honest in the survey
When asked afterwards, participants rated the minigames as more enjoyable and immersive than the traditional survey. But they also reported being more honest in the survey. Because the open-ended questions involved role-playing scenarios, they gave people more room to present themselves differently from how they actually are.

What It All Means

The tool built in this study is not yet reliable enough to replace traditional personality assessments. But that was somewhat expected for a first attempt. What it does show is that behaviour in minigames and the way people write carry real signals about personality, signals that self-report questionnaires are not designed to capture.

The most promising direction from here is refining the minigames, especially improving the theoretical link between specific game mechanics and the personality facets they are meant to measure. The LLM analysis also has room to improve with better prompting and more targeted questions.

The full thesis, data, code, and pre-registration are all available on the OSF page.

Built With

Serious Game, Research, JavaScript, HTML, CSS, p5.js, PHP, Python, Qualtrics, Claude API, AI, Excel, Statistics, OSF, Leiden University
