Eric Le

case study

Jobtriage

Live agent triages Swedish job ads against any profile.

problem

Job boards rank for the platform's monetization, not the candidate's fit. Profile-driven matching should be a first-class operation, not a bolt-on to retrieval.

system

Two postures share one agent shell. Same prompt, same tools, different data path.

agent shell
  deploy: JobTech taxonomy and JobSearch APIs
  local: SQLite corpus, hybrid retrieval stack

Deploy runs against the JobTech taxonomy and JobSearch APIs. Local CLI and local dev run the hybrid retrieval stack against a SQLite corpus.
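A minimal sketch of how one tool body could serve both postures. Every name here (`Ad`, `DataPath`, `LocalPath`, `DeployPath`, `search_jobs`) is hypothetical and not taken from the Jobtriage codebase. The local path uses a naive substring match as a stand-in for the hybrid stack, and the deploy path takes an injected fetch function instead of calling the JobTech APIs.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Ad:
    ad_id: str
    title: str


class DataPath(Protocol):
    """Hypothetical interface: anything that turns a query into ads."""

    def search(self, query: str, limit: int) -> list[Ad]: ...


class LocalPath:
    """Stand-in for the SQLite corpus plus hybrid retrieval stack."""

    def __init__(self, corpus: list[Ad]) -> None:
        self._corpus = corpus

    def search(self, query: str, limit: int) -> list[Ad]:
        # Placeholder for real retrieval: case-insensitive substring match.
        q = query.lower()
        return [ad for ad in self._corpus if q in ad.title.lower()][:limit]


class DeployPath:
    """Stand-in for the API-backed path; the fetch function is injected."""

    def __init__(self, fetch: Callable[[str, int], list[Ad]]) -> None:
        self._fetch = fetch

    def search(self, query: str, limit: int) -> list[Ad]:
        return self._fetch(query, limit)


def search_jobs(path: DataPath, query: str, limit: int = 10) -> list[Ad]:
    """Same tool body for both postures; only `path` differs."""
    return path.search(query, limit)
```

With this shape the prompt and tool schema never need to know which posture is active; the choice is made once, when the data path is constructed.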

Frontend: Next.js App Router on Vercel, Vercel AI SDK
Backend: FastAPI on Cloud Run europe-west1, 1Gi memory
Retrieval: BM25 + multilingual-e5-base dense + RRF over SQLite
BYOK: Anthropic, OpenAI, Gemini, local Ollama, mock replay
Domain: Cloudflare A record fronting Vercel

The mock-replay path lets a recruiter try the surface in five seconds. BYOK lets a technical visitor drive the agent with their own key. Most agent demos punt on both.

retrieval

50-query Swedish golden set against a 59-ad corpus from Spotify, Klarna, Volvo Group, Volvo Cars, Ericsson, HT Engineering, Stig Ericsson Bil, Montico, and Isaksson Rekrytering. Embeddings from intfloat/multilingual-e5-base.
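For reference, the P@1 and R@10 figures reported in this section can be computed along the following lines. This is a sketch under assumptions: the function names are hypothetical, each golden query is assumed to carry a set of relevant ad ids, and recall is assumed to be averaged per query. The actual evaluation harness may differ.

```python
def precision_at_1(ranked: list[str], relevant: set[str]) -> float:
    """1.0 if the top-ranked ad is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def recall_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the relevant ads that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def evaluate(runs: list[tuple[list[str], set[str]]]) -> dict[str, float]:
    """Average both metrics over (ranked ids, relevant ids) pairs."""
    n = len(runs)
    return {
        "P@1": sum(precision_at_1(r, rel) for r, rel in runs) / n,
        "R@10": sum(recall_at_k(r, rel, 10) for r, rel in runs) / n,
    }
```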

hybrid retrieval ablation
configuration   P@1     R@10    p95 (ms)
filter-only     0.020   0.150    0.0
bm25-only       0.680   0.920    1.2
dense-only      0.780   0.965    7.8
hybrid          0.720   0.950   15.2

Dense alone beats hybrid on P@1 by 6 points (0.780 vs 0.720) on this corpus, and edges it on R@10 (0.965 vs 0.950). Hybrid earns its place on adversarial queries where exact keyword matches dominate, such as model names and employer jargon. An RRF score floor, set with JOBTRIAGE_RRF_FLOOR=0.025, suppresses low-relevance results at the API boundary.
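A sketch of reciprocal rank fusion with a score floor. The environment variable name and the 0.025 default come from the text above; the function name `rrf_fuse` and the constant k=60 (the conventional RRF default) are assumptions, since the project's actual k is not stated.

```python
import os
from typing import Optional


def rrf_fuse(
    rankings: list[list[str]],
    k: int = 60,  # assumed; the conventional RRF constant
    floor: Optional[float] = None,
) -> list[tuple[str, float]]:
    """Fuse ranked id lists with RRF, then drop results below the floor."""
    if floor is None:
        floor = float(os.environ.get("JOBTRIAGE_RRF_FLOOR", "0.025"))
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each ranker contributes 1 / (k + rank) for every id it returns.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(doc_id, score) for doc_id, score in fused if score >= floor]
```

Under the k=60 assumption, an ad returned by only one of two rankers scores at most 1/61, about 0.0164, so a floor of 0.025 would keep only ads that both BM25 and the dense ranker surface. With a different k the floor would cut at a different point.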

multilingual encoder comparison (dense)
encoder         P@1     R@10    dim
MiniLM (en)     0.700   0.855    384
e5-base (ml)    0.780   0.965    768
e5-large (ml)   0.860   0.945   1024

English-only MiniLM loses 11 points of R@10 against e5-base on the Swedish golden set (0.855 vs 0.965). e5-large lifts P@1 by another 8 points over e5-base (0.860 vs 0.780), though its R@10 is 2 points lower (0.945 vs 0.965). The MiniLM dense numbers run slightly low because its inputs carry the e5 prefix tokens, which it never saw in training and which read as noise. The multilingual encoder choice was load-bearing, not aesthetic.
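The e5 model family expects inputs prefixed with "query: " or "passage: ". One way to keep those prefixes away from encoders that were not trained on them is to apply them per model family, as in the sketch below. The helper name `prepare_texts` and the prefix-matching rule on model names are hypothetical, not the project's code.

```python
# Model-name prefixes treated as e5-family (an assumption for this sketch).
E5_FAMILY = ("intfloat/multilingual-e5-", "intfloat/e5-")


def prepare_texts(model_name: str, texts: list[str], role: str) -> list[str]:
    """Add the e5 role prefix only for e5-family encoders."""
    if role not in ("query", "passage"):
        raise ValueError(f"unknown role: {role!r}")
    if model_name.startswith(E5_FAMILY):
        return [f"{role}: {text}" for text in texts]
    # Other encoders receive the raw text unchanged.
    return list(texts)
```

Encoding MiniLM inputs without the prefixes would remove that confound from the comparison, at the cost of one more branch in the embedding pipeline.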

agent

Each data tool is paired with a spatial tool, and the pairings are pinned in the system prompt. After every data tool call the agent fires the matching spatial tool. The canvas is the answer, not a decoration.

searchJobs -> placeAds
triageBatch -> groupAds
matchProfile -> connectProfileToAds
compareRoles -> pairAdsForCompare
deadlineWatch -> placeAdsOnTimeline
trackStatus -> markStatus
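The pairing rule can also be checked mechanically against a tool-call trace. The sketch below is an illustration only: the project pins the pairings in the system prompt, and nothing in the source says a check like this exists. The mapping assumes the tool list above alternates data tool, then spatial tool, and `missing_spatial_calls` is a hypothetical name.

```python
# Data tool -> spatial tool, assuming the listed order alternates pairs.
SPATIAL_PAIR = {
    "searchJobs": "placeAds",
    "triageBatch": "groupAds",
    "matchProfile": "connectProfileToAds",
    "compareRoles": "pairAdsForCompare",
    "deadlineWatch": "placeAdsOnTimeline",
    "trackStatus": "markStatus",
}


def missing_spatial_calls(trace: list[str]) -> list[str]:
    """Return data tools not immediately followed by their paired spatial tool."""
    missing = []
    for i, tool in enumerate(trace):
        expected = SPATIAL_PAIR.get(tool)
        if expected is None:
            continue  # spatial or unknown tool, nothing to check
        following = trace[i + 1] if i + 1 < len(trace) else None
        if following != expected:
            missing.append(tool)
    return missing
```

A check like this could run over recorded mock-replay traces to catch prompt regressions where the agent returns data without updating the canvas.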

React Flow surfaces four canonical views: triage clusters, deadline timeline, side-by-side compare, and pinned shortlist. Custom nodes per view, not the React Flow defaults.