The Watchlist Project: Building a Recommendation Engine for the Unexpected

Published on Dec 5, 2025

I've started work on something I've wanted for years but never found anywhere in the wild: a recommendation engine that doesn't just show me more of the same. I want something that nudges me toward films and TV shows I'd never normally pick. Not the popular stuff, not the algorithmically obvious stuff - something far more personal, and a little bit unexpected.

For now, I'm calling it the Watchlist Project.

The basic features are what you'd expect. I'll be logging what I've watched, what I'm watching, and what I plan to watch. That gives me the raw data to chew on. But that isn't the grand idea. The real goal is to build a system that can form a kind of "taste profile" from that data - not a demographic caricature ("you watched a sci-fi film, here are nine more sci-fi films") but a more nuanced sense of mood, tone, themes, pacing, emotional temperature, and the kinds of creative risks I tend to gravitate toward.

I want a system that can say:

"You gravitate toward character-driven stories with moral tension and a hint of cosmic unease. Try this obscure Norwegian gem from the 90s. Trust me."

The Core Idea: A Taste Profile That Actually Means Something

Instead of pigeonholing me into genres, the system will aim to capture things that matter more:

Mood and emotional temperature
Themes and ideas beneath the surface
Story structure and pacing
Creative risk and stylistic fingerprints
Patterns that aren't obvious but feel intuitive when surfaced

The goal isn't "more sci-fi because you watched sci-fi."
The goal is "here's something you never would have clicked on, but will absolutely enjoy."

The Tech Behind the Scenes

I spend most of my professional life in PHP and web development, so naturally that's where the backbone of this project starts. But I'll use whatever languages or tools are appropriate for the task at hand.

I'll be breaking the system architecture into the following stages.

Data Ingestion Layer

This layer pulls raw data from the following sources:

Manual entries (films watched, in progress, want to watch)
External APIs (TMDB primarily)
Optional metadata sources (Letterboxd CSV export, IMDb ratings, etc.)

The ingestion layer normalises:

Titles
IDs
Genre tags
Cast/crew information
Keywords
Release data

Think of it as a very opinionated ETL pipeline for metadata + user behaviour.
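To make the normalisation step a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the TitleRecord shape, the normalise helper, and the flat input dictionary (with "id", "title", "release_date", "genres" and "keywords" keys) are hypothetical and do not mirror the exact response format of TMDB or any other source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TitleRecord:
    """Hypothetical normalised record; the field names are illustrative only."""
    source: str
    source_id: str
    title: str
    release_year: Optional[int]
    genres: Tuple[str, ...] = ()
    keywords: Tuple[str, ...] = ()


def _clean_tags(values) -> Tuple[str, ...]:
    """Lower-case, collapse whitespace, and de-duplicate tags, keeping order."""
    seen = {}
    for value in values:
        tag = " ".join(str(value).split()).lower()
        if tag:
            seen.setdefault(tag, None)
    return tuple(seen)


def normalise(raw: dict, source: str) -> TitleRecord:
    """Map one raw entry (assumed flat dict) onto the common record shape."""
    date = str(raw.get("release_date") or "")
    year = int(date[:4]) if date[:4].isdigit() else None
    return TitleRecord(
        source=source,
        source_id=str(raw["id"]),
        title=" ".join(str(raw.get("title", "")).split()),
        release_year=year,
        genres=_clean_tags(raw.get("genres", [])),
        keywords=_clean_tags(raw.get("keywords", [])),
    )
```

Keeping the source name next to the source's own ID is one way to let manual entries and API results sit in the same table without ID clashes.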

Feature Extraction

This is where the fun begins.

Potential features extracted per title:

Thematic embeddings (a semantic vector for plot, tone, and pacing, generated by an LLM or other ML model)
Crew signatures (directorial style, writer patterns)
Narrative archetypes (hero's journey, anti-structure, character-driven drama)
Genre/keyword weighting
Temporal features (era, runtime, production region)

Potential features extracted per user:

Taste-vector average (mean embedding of watched titles)
Taste clusters (distinct "modes" of taste detected via clustering)
Novelty tolerance (how far recommendations can deviate from known tastes)
Serendipity bias (how often obscure or low-visibility titles should be surfaced)
Diversity appetite (cross-genre, cross-era trends)
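As a rough illustration of the first user feature, the taste-vector average is just the element-wise mean of the embeddings of watched titles. The sketch below assumes embeddings arrive as plain lists of floats. The taste_spread helper is purely my own hypothetical addition: it measures how scattered the watched titles are around that mean, which might serve as one crude input to something like novelty tolerance.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized embedding vectors."""
    if not vectors:
        raise ValueError("need at least one embedding")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("embeddings must share one dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def taste_spread(vectors):
    """Average cosine distance of watched titles from their mean vector.

    Hypothetical proxy: a larger value suggests a more varied watch history.
    """
    centre = mean_vector(vectors)
    return sum(cosine_distance(v, centre) for v in vectors) / len(vectors)
```

A single mean can blur very different tastes into one bland midpoint, which is part of why the taste clusters listed above are worth having alongside it.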

Recommendation Engine

This is the core algorithmic component. Some early approaches under consideration:

Vector similarity search across thematic embeddings - cosine similarity on content embeddings rather than genre labels.
Weighted matrix factorisation - not to predict ratings, but to estimate interest likelihood from behavioural patterns.
"Friction distance" metric - a custom measure of how far a recommendation can safely stray from the user's tastes while still feeling relevant.
Clustering-based discovery - if I have distinct clusters (e.g., "slow sci-fi" and "British social realism"), the engine should occasionally bridge the gap with hybrid or unexpected picks.
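Here is one way the first and third ideas might combine, as a sketch rather than a settled design. It assumes the friction distance can be approximated by cosine distance from the taste vector, and that "safely stray" means staying inside a band between min_distance and max_distance. The recommend function, its parameters, and the default thresholds are all hypothetical placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(taste, candidates, min_distance=0.1, max_distance=0.5, limit=5):
    """Pick titles that are neither too familiar nor too alien.

    taste: the user's taste vector.
    candidates: dict mapping title -> embedding of the same dimension.
    The band thresholds are illustrative guesses, not tuned values.
    """
    scored = []
    for title, embedding in candidates.items():
        distance = 1.0 - cosine_similarity(taste, embedding)
        if min_distance <= distance <= max_distance:
            scored.append((distance, title))
    # Farthest-but-still-in-band first, to favour the more surprising picks.
    scored.sort(reverse=True)
    return [title for _, title in scored[:limit]]
```

The lower bound does the real work here: it drops the near-duplicates that a plain nearest-neighbour search would rank first, while the upper bound keeps picks from feeling random.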

This engine isn't trying to maximise watch-time. It aims for human-scale delight: the joy of discovering something that feels impossibly tailored yet surprising.

Presentation Layer

Initially, this will be a simple web app. Later, I'd like to add:

Personalised watchlist management
A recommendation feed
Title exploration pages
Taste-profile visualisations
A transparent "why this recommendation?" breakdown
Maybe even a public API or data export

The interface is secondary to the idea, but still important. Discovery feels better when the UI feels like a curious companion rather than a corporate slot machine.

Why Build This Myself?

Because mainstream systems optimise for engagement, not discovery. Their goals aren't my goals.

I want recommendations that understand:

Why I love Waiting for Bojangles and Finding Nemo in different ways
That my interest in 90s sci-fi doesn't mean I want more nostalgia
That my "comfort watches" don't define me
That I crave strange, small, forgotten films just as much as the big blockbusters

The Watchlist Project is an attempt to build a tool that respects the weird knots of human taste.

Why I'm Sharing This

I want this blog to be more than a diary. By writing about the Watchlist Project publicly:

I hope to attract collaborators or like-minded developers who might want to join the journey.
I want to show my thought process and skills in a way that's concrete and project-based.
And I want to document the evolution of a tech project from idea to execution, with all the mistakes, experiments, and breakthroughs along the way.

If you're curious about building something similar, enjoy following tech projects, or just love films as much as I do, I hope you follow along.

Stay tuned. There's a lot to build - and a lot to watch.